THE AXIOMATIC METHOD
STUDIES IN LOGIC
AND
THE FOUNDATIONS OF
MATHEMATICS
L. E. J. BROUWER
E. W. BETH
A. HEYTING
Editors
1959
NORTH-HOLLAND PUBLISHING COMPANY
AMSTERDAM
THE AXIOMATIC METHOD
WITH SPECIAL REFERENCE TO GEOMETRY
AND PHYSICS
Proceedings of an International Symposium held at the
University of California, Berkeley, December 26, 1957 — January 4, 1958
Edited by
LEON HENKIN
Professor of Mathematics, University of California, Berkeley
PATRICK SUPPES
Associate Professor of Philosophy, Stanford University
ALFRED TARSKI
Professor of Mathematics and Research Professor, University
of California, Berkeley
1959
NORTH-HOLLAND PUBLISHING COMPANY
AMSTERDAM
No part of this book may be reproduced
in any form by print, microfilm or any
other means without written permission
from the publisher
PRINTED IN THE NETHERLANDS
CONTENTS
PREFACE
PART I. FOUNDATIONS OF GEOMETRY
DIE MANNIGFALTIGKEIT DER DIREKTIVEN FÜR DIE GESTALTUNG
GEOMETRISCHER AXIOMENSYSTEME. Paul Bernays
WHAT IS ELEMENTARY GEOMETRY? Alfred Tarski
SOME METAMATHEMATICAL PROBLEMS CONCERNING ELEMENTARY HYPERBOLIC
GEOMETRY. Wanda Szmielew
DIMENSION IN ELEMENTARY EUCLIDEAN GEOMETRY. Dana Scott
BINARY RELATIONS AS PRIMITIVE NOTIONS IN ELEMENTARY GEOMETRY.
Raphael M. Robinson
REMARKS ON PRIMITIVE NOTIONS FOR ELEMENTARY EUCLIDEAN AND
NON-EUCLIDEAN PLANE GEOMETRY. H. L. Royden
DIRECT INTRODUCTION OF WEIERSTRASS HOMOGENEOUS COORDINATES IN
THE HYPERBOLIC PLANE, ON THE BASIS OF THE ENDCALCULUS OF HILBERT.
Paul Szasz
AXIOMATISCHER AUFBAU DER EBENEN ABSOLUTEN GEOMETRIE.
Friedrich Bachmann
NEW METRIC POSTULATES FOR ELLIPTIC n-SPACE. Leonard M. Blumenthal
AXIOMS FOR GEODESICS AND THEIR IMPLICATIONS. Herbert Busemann
AXIOMS FOR INTUITIONISTIC PLANE AFFINE GEOMETRY. A. Heyting
GRUNDLAGEN DER GEOMETRIE VOM STANDPUNKT DER ALLGEMEINEN
TOPOLOGIE AUS. Karol Borsuk
LATTICE-THEORETIC APPROACH TO PROJECTIVE AND AFFINE GEOMETRY.
Bjarni Jónsson
CONVENTIONALISM IN GEOMETRY. Adolf Grünbaum
PART II. FOUNDATIONS OF PHYSICS
HOW MUCH RIGOR IS POSSIBLE IN PHYSICS? Percy W. Bridgman
LA FINITUDE EN MECANIQUE CLASSIQUE, SES AXIOMES ET LEURS IMPLICATIONS.
Alexandre Froda
THE FOUNDATIONS OF RIGID BODY MECHANICS AND THE DERIVATION OF ITS
LAWS FROM THOSE OF PARTICLE MECHANICS. Ernest W. Adams
THE FOUNDATIONS OF CLASSICAL MECHANICS IN THE LIGHT OF RECENT
ADVANCES IN CONTINUUM MECHANICS. Walter Noll
ZUR AXIOMATISIERUNG DER MECHANIK. Hans Hermes
AXIOMS FOR RELATIVISTIC KINEMATICS WITH OR WITHOUT PARITY.
Patrick Suppes
AXIOMS FOR COSMOLOGY. A. G. Walker
AXIOMATIC METHOD AND THEORY OF RELATIVITY. EQUIVALENT OBSERVERS AND
SPECIAL PRINCIPLE OF RELATIVITY. Yoshio Ueno
ON THE FOUNDATIONS OF QUANTUM MECHANICS. Herman Rubin
THE MATHEMATICAL MEANING OF OPERATIONALISM IN QUANTUM MECHANICS.
I. E. Segal
QUANTUM THEORY FROM NON-QUANTAL POSTULATES. Alfred Lande
QUANTENLOGIK UND DAS KOMMUTATIVE GESETZ. Pascual Jordan
LOGICAL STRUCTURE OF PHYSICAL THEORIES. Paulette Février
PHYSICO-LOGICAL PROBLEMS. J. L. Destouches
PART III. GENERAL PROBLEMS AND APPLICATIONS
OF THE AXIOMATIC METHOD
STUDIES IN THE FOUNDATIONS OF GENETICS. J. H. Woodger
AXIOMATIZING A SCIENTIFIC SYSTEM BY AXIOMS IN THE FORM OF
IDENTIFICATIONS. R. B. Braithwaite
DEFINABLE TERMS AND PRIMITIVES IN AXIOM SYSTEMS. Herbert A. Simon
AN AXIOMATIC THEORY OF FUNCTIONS AND FLUENTS. Karl Menger
AXIOMATICS AND THE DEVELOPMENT OF CREATIVE TALENT. R. L. Wilder
PREFACE
The thirty-three papers in this volume constitute the proceedings of
an international symposium on The axiomatic method, with special reference to
geometry and physics. This symposium was held on the Berkeley campus
of the University of California during the period from December 26, 1957
to January 4, 1958.
The volume naturally divides into three parts. Part I consists of fourteen
papers on the foundations of geometry, Part II of fourteen papers
on the foundations of physics, and Part III of five papers on general
problems and applications of the axiomatic method. General differences
between the character of the papers in Part I and those in
Part II reflect the relative state of development of the axiomatic method
in geometry and in physics. Indeed, one of the important aims of the
symposium was precisely to confront two disciplines in which the pattern
of axiomatic development has been so markedly different.
Geometry, as is well known, is the science in which, more than 2300
years ago, the axiomatic method originated. Work on the axiomatization
of geometry was greatly stimulated, and our conception of the significance
and scope of the axiomatic method itself was greatly expanded, through
the construction of non-Euclidean geometries in the first half of the
nineteenth century. By the turn of the century we find for the first time,
in the works of men like Pasch, Peano, Pieri, and Hilbert, axiomatic
treatments of geometry which both are complete and meet the exacting
standards of contemporary methodology of the deductive sciences. Since
that time there has been a continuous and accelerating development of the
subject, so that at present, all of the important geometric theories
have been axiomatized, and new theories have been created through
changes introduced into various systems of axioms; for most theories
a variety of axiom systems is available conforming to varying ideals
which have been pursued in connection with the axiomatization of
geometry. Most recently, building upon the work on axiomatization,
it has become possible to formalize geometrical theories, and in
consequence geometrical theories themselves have been made the
object of exact investigation by metamathematical methods, leading
to several new kinds of results. The present volume contains new
contributions to many of the directions in which studies in the foundations
of geometry have been developing.
Axiomatic work in the foundation of physics has had a more checkered
history. Newton's Principia, first published in 1687, emulated Euclid's
Elements, but the eighteenth and nineteenth centuries did not witness a
development of axiomatic methods in physics at all comparable to that
in geometry. Even the work in this century on axiomatizing various
branches of physics has been relatively slight in comparison with the
massive mathematical development of geometry. There is not to our
knowledge a single treatise on classical mechanics which compares in
axiomatic precision with such a work as Hilbert's well-known text on the
foundations of geometry; furthermore, the axiomatic treatments of
various branches of physics which have been attempted, including those
presented in this volume, do not yet have the finished and complete
character typical of geometrical axiomatizations. Much foundational
work in physics is still of the programmatic sort, and it is possible to
maintain that the status of axiomatic investigations in physics is not
yet past the preliminary stage of philosophical discussion expressing
doubt as to its purpose and usefulness. In spite of such doubts, an
increasing effort is being made to apply axiomatic methods in physics, and
many of the papers in Part II indicate how exact mathematical methods
may be brought to bear on problems in the foundations of physics. To the
knowledge of the Editors the papers in Part II constitute the first
collection whose aim is specifically to provide an over-all perspective of
the application of axiomatic methods in physics. It is our candid hope
that this book will be a stimulus to further work in this important domain.
An attempt has been made to give coherence to the volume by grouping
papers according to their subject. Part I begins with a paper by Bernays
which surveys the main tendencies manifesting themselves in the
construction of geometrical axiom systems. In the five papers which follow,
metamathematical notions referring to the axiomatic foundations of
various systems of Euclidean and non-Euclidean geometry are discussed,
and to a large extent specific metamathematical methods are applied.
The first three papers, namely those by Tarski, Szmielew, and Scott, are
concerned with problems of completeness and decidability, while the
remaining two, those by Robinson and Royden, deal with problems of
definability. The next five papers set forth new axiomatizations of various
branches of geometry. Szasz is concerned with hyperbolic geometry,
Bachmann with absolute geometry, Blumenthal with elliptic geometry,
Busemann with metric differential geometry, and Heyting with affine
geometry; the last of these authors approaches the subject from the
intuitionistic point of view. In the following two papers connections
between the foundations of geometry and some related branches of
mathematics are studied. In particular, Borsuk examines Euclidean
geometry from the standpoint of topology, and Jónsson surveys projective
and affine geometry from the standpoint of lattice theory. The
last paper of Part I, that of Grünbaum, deals with the philosophical
problem of conventionalism in geometry.
In the case of Part II, the order has, roughly speaking, followed the
historical development of physics. The opening paper of Bridgman
analyzes the general notion of rigor in physics. It is followed by four
papers on the axiomatic foundations of classical mechanics. Froda
considers particle mechanics, Adams rigid body mechanics and Noll
continuum mechanics; Hermes analyzes certain axiomatic problems
surrounding the notion of mass. Three papers on relativity follow. Suppes
deals with relativistic kinematics, Walker with relativistic cosmology
and Ueno with relativity theory as based on the concept of equivalent
observers. Next come three papers on quantum mechanics. Rubin
considers quantum mechanics from the standpoint of the theory of
stochastic processes; Segal examines the mathematical meaning of
operationalism in quantum mechanics; and Lande approaches the subject
on the basis of non-quantal postulates. Finally, there are three papers
which deal with relations between logic and physics. Jordan considers
quantum logic and the commutative law; Février the logical structure
of physical theories; and Destouches the theory of prediction with special
reference to physico-logical problems.
The arrangement of papers in Part III is somewhat arbitrary. Loosely
speaking, the papers move from more specific to more general topics.
Woodger is concerned with the foundations of genetics; Braithwaite with
scientific theories whose axioms take the form of identities; Simon with
primitive and definable terms in axiom systems; Menger with the general
theory of functions in the context of the empirical sciences; and Wilder
with the potentiality of the axiomatic approach as a method of teaching.
It goes without saying that each author is solely responsible for the
content of his paper. The Editors have confined themselves to arranging
the volume and handling various technical matters relating to publication.
In particular, the choice of notation and symbolism has been left to the
individual author.
The calendar of the scientific sessions was as follows:
December 26, afternoon. Opening remarks by Acting Chancellor James
D. Hart of the University of California, Berkeley, and by Professor
Alfred Tarski of the same University. Section I, Professor Paul Bernays
(Zurich, Switzerland). Section II, Professor P. W. Bridgman (Cambridge,
Massachusetts, U.S.A.).
December 27, morning. Section II, Professor Hans Hermes (Münster,
Germany), Professor Walter Noll (Pittsburgh, Pennsylvania, U.S.A.).
December 27, afternoon. Section I, Professor Friedrich Bachmann
(Kiel, Germany), Professor Alfred Tarski (Berkeley, California, U.S.A.).
December 28, morning. Section II, Professor Ernest Adams (Berkeley,
California, U.S.A.), Professor Yoshio Ueno (Hiroshima, Japan).
December 28, afternoon. Section I, Mr. Dana Scott (Princeton, New
Jersey, U.S.A.), Professor H. L. Royden (Stanford, California, U.S.A.),
Professor Raphael M. Robinson (Berkeley, California, U.S.A.).
December 30, morning. Section II, Professor Arthur G. Walker
(Liverpool, England), Professor Patrick Suppes (Stanford, California,
U.S.A.).
December 30, afternoon. Commemorative talks on the first anniversary
of the death of Heinrich Scholz by Alfred Tarski and Paul Bernays.
Section I, Professor Paul Szasz (Budapest, Hungary), Professor Wanda
Szmielew (Warsaw, Poland, and Berkeley, California, U.S.A.).
December 31, morning. Section II, Professor Irving E. Segal (Chicago,
Illinois, U.S.A.), Professor Jean-Louis Destouches (Paris, France).
December 31, afternoon. Section I, Professor Leonard M. Blumenthal
(Columbia, Missouri, U.S.A.), Professor Herbert Busemann (Los Angeles,
California, U.S.A.).
January 2, morning. Section III, Professor Joseph H. Woodger
(London, England), Professor Richard Braithwaite (Cambridge, England).
January 2, afternoon. Section I, Professor Karol Borsuk (Warsaw,
Poland), Professor Bjarni Jónsson (Minneapolis, Minnesota, U.S.A.).
January 3, morning. Section II, Professor Pascual Jordan (Hamburg,
Germany), Dr. Paulette Février (Paris, France).
January 3, afternoon. Section I, Professor Arend Heyting (Amsterdam,
Netherlands), Professor Adolf Grünbaum (Bethlehem, Pennsylvania,
U.S.A.).
January 4, morning. Section II, Professor Alfred Lande (Columbus,
Ohio, U.S.A.), Professor Herman Rubin (Eugene, Oregon, U.S.A.).
January 4, afternoon. Section III, Professor Karl Menger (Chicago,
Illinois, U.S.A.), Professor Raymond L. Wilder (Ann Arbor, Michigan,
U.S.A.).
Three invited speakers whose papers are included in this volume were
unable actually to attend the symposium: the paper of Paul Szasz was
read by Steven Orey, and the papers of Alexandre Froda and Herbert
Simon were presented by title. Several talks were presented originally
under different titles than appear in this volume: R. B. Braithwaite,
Necessity and contingency in the empirical interpretation of axiomatic
systems; A. Lande, Non-quantal foundations of quantum mechanics; H. L.
Royden, Binary relations as primitive notions in geometry with set-theoretical
basis; A. G. Walker, Axioms of kinematical relativity.
This symposium was jointly sponsored by the U. S. National Science
Foundation (which contributed the bulk of the supporting funds), the
International Union for the History and Philosophy of Science (Division
of Logic, Methodology, and Philosophy of Science), and the University
of California. The symposium was organized by a committee consisting
of Leon Henkin, Secretary (University of California, Berkeley), Victor F.
Lenzen (University of California, Berkeley), Benson Mates (University of
California, Berkeley), Ernest Nagel (Columbia University, New York),
Steven Orey (University of California, Berkeley, and University of
Minnesota, Minneapolis), Julia Robinson (Berkeley, California), Patrick
Suppes (Stanford University, Stanford, California), Alfred Tarski,
Chairman (University of California, Berkeley), and Raymond L. Wilder
(University of Michigan, Ann Arbor). The Secretary of the symposium
was Dorothy Wolfe.
We gratefully acknowledge the help of Mr. Rudolf Grewe and Dr.
Dana Scott in preparing this volume for publication.
University of California, Berkeley THE EDITORS
Stanford University
February 1959
Symposium on the Axiomatic Method
DIE MANNIGFALTIGKEIT DER DIREKTIVEN FÜR DIE
GESTALTUNG GEOMETRISCHER AXIOMENSYSTEME
[The Multiplicity of Directives for the Design of Geometric Axiom Systems]
PAUL BERNAYS
Eidgenössische Technische Hochschule, Zürich, Switzerland
In considering the axiomatizations of geometry, we are struck by the
great multiplicity of viewpoints from which axiomatization can be, and
already has been, carried out. The original, simple old conception,
according to which one can speak of the axioms of geometry without
qualification, has been displaced not only by the discovery of the
non-Euclidean geometries, and further by the insight that one and the
same geometry admits different axiomatizations; rather, essentially
different methodological viewpoints have emerged from which the
axiomatization of geometry has been undertaken, and whose aims are even
antagonistic in certain respects.
The germ of this multiplicity is already to be found in Euclidean
axiomatics. Its form was determined by the circumstance that here, by
way of geometry, one was led for the first time to the problem of
axiomatics. Geometry is here, so to speak, mathematics as such. Its
relation to number theory is, methodologically, not entirely clear. In
certain parts a piece of number theory is developed using the intuitive
conception of number. Furthermore, in the theory of proportions the
number concept is used contentually, even with an implicit inclusion of
the tertium non datur; it seems, however, that one sought to avoid its
full use.
While the special methodological position of the number concept does not
emerge explicitly here, the concept of magnitude is expressly placed at
the head as a contentual tool, and in a manner, incidentally, that we can
no longer concede today: namely, it is assumed as self-evident of various
kinds of objects that they have the character of magnitudes. The concept
of magnitude is, to be sure, also subjected to axiomatization; the
relevant axioms, however, are expressly separated from the remaining
axioms as prior ones (κοιναὶ ἔννοιαι, common notions). These axioms are
similar in kind to those set up today for abelian groups. What was left
undone, however, on account of the methodological standpoint of the time,
was to fix axiomatically which objects are to be regarded as magnitudes.
It is all the more to be admired that attention was drawn even then to
the special character of the assumption by which the Archimedean
magnitudes, as we call them today, are distinguished. The Archimedean
(Eudoxian) axiom is then used essentially in the medieval tradition
following the Greeks, in particular in the investigations of the Arabs on
the parallel axiom. It also appears as essential in Saccheri's proof
excluding the "hypothesis of the obtuse angle". Indeed, this exclusion is
not possible without the Archimedean axiom, since a non-Archimedean,
weakly spherical (or weakly elliptic) geometry is consistent with the
axioms of Euclidean geometry apart from the parallel axiom.
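For reference, the Archimedean axiom discussed in this paragraph can be
written as follows. This is a standard modern formulation for positive
magnitudes a and b; the notation is an assumption of this note and is not
taken from Bernays's text.

```latex
\forall a\,\forall b\;\Bigl( a > 0 \,\wedge\, b > 0 \;\rightarrow\;
  \exists n \in \mathbb{N}\;\; n \cdot a > b \Bigr),
\qquad
n \cdot a := \underbrace{a + a + \dots + a}_{n \text{ summands}} .
```

Written without the quantifier over natural numbers, the same condition
becomes the unbounded disjunction a > b, or a + a > b, or a + a + a > b,
and so on, which is not a single formula of first-order logic.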
In all these investigations the second continuity axiom, which was
formulated in the later nineteenth century, does not yet appear. In the
arguments for which it came into consideration, such as the
determinations of areas and lengths, it could be dispensed with on the
basis of the use of the concept of magnitude mentioned above, according
to which it counted as self-evident, for example, that the area of the
circle and its circumference possess a definite magnitude. At the
beginning of the modern era the old theory of magnitudes was replaced, as
the dominant superordinate discipline, by the theory of magnitudes of
analysis, which developed very richly both formally and in content even
before it attained methodological clarity.
Admittedly, analysis played no considerable role at first in the
discovery of non-Euclidean geometry, but it does become dominant in the
subsequent investigations of Riemann and Helmholtz, and later of Lie, on
characterizing the three distinguished geometries by certain very
general, analytically expressible conditions. Characteristic of this
treatment of geometry is in particular that not only the individual
spatial figures but also the spatial manifold itself is taken as the
object of study. The possibility of carrying out such an investigation
displayed the powerful conceptual and formal means that mathematics had
acquired in the meantime; and the design of the problem expressed the
conceptual-speculative direction that mathematics took in the course of
the nineteenth century.
The differential-geometric treatment of the foundations of geometry has,
incidentally, been developed further up to the most recent times by
Hermann Weyl as well as Élie Cartan and Levi-Civita, in connection with
Einstein's general theory of relativity. Impressive and elegant as what
has been achieved in this respect is, mathematicians have nevertheless
not been satisfied with it from the foundational standpoint. First, one
sought to free oneself from the assumption, essential to the
differential-geometric method, that the mapping functions are
differentiable. This required the development of the methods of a general
topology, which began around the turn of the century and has since
undergone so imposing a development. Going further, one strove to become
independent of the assumption of the Archimedean character of geometric
magnitudes altogether.
This tendency belongs to the development through which analysis has, to
a certain extent, lost its previously dominant position. This new stage
in mathematical research was tied to the effects of the already mentioned
conceptual-speculative direction of mathematics in the nineteenth
century, as it manifested itself in particular in the creation of general
set theory, in the more rigorous founding of analysis, in the
constitution of mathematical logic, and in the new conception of
axiomatics.
It was at the same time characteristic of this new stage that one
returned more to the methods of ancient Greek axiomatics, as has happened
repeatedly in epochs in which greater emphasis was laid on conceptual
precision. In Hilbert's Grundlagen der Geometrie we find on the one hand
this return to the old elementary axiomatics, albeit in a fundamentally
changed methodological conception, and on the other hand, as a principal
theme, the elimination of the Archimedean axiom as far as possible: in
the theory of proportions, in the concept of area, and in the founding of
the segment calculus. This kind of axiomatization, incidentally, did not
have for Hilbert the sense of exclusiveness; soon afterwards he set
beside it another kind of foundation, with which the program of a
topological foundation mentioned above was for the first time set up and
carried out.
At about the same time as Hilbert's foundation, the axiomatization of
geometry was also cultivated in the school of Peano and Pieri. Soon there
followed the axiomatic investigations of Veblen and R. L. Moore; and the
directions of research had now been entered upon in which the study of
the foundations of geometry continues to move today. Characteristic of
this is a plurality of methodological directions.
One is that which seeks to characterize the manifold of congruent
transformations by conditions that are as general and concise as
possible; the second is that which puts the projective structure of space
first and seeks to reduce the metric to the projective by the method of
projective metric determination developed by Cayley and Klein; and the
third is that which aims at an elementary axiomatization of the full
geometry of congruence.
Various essentially new viewpoints have been added in the development of
these directions. For one thing, projective axiomatics received a
strengthened systematization by means of lattice theory. Furthermore, it
was realized that in characterizing the group of congruent
transformations the set-theoretic and function-theoretic concept
formations can be made to recede, by fixing the transformations through
the figures that determine them. The procedure thereby comes close to
that of elementary axiomatics, since the group relations now present
themselves as relations between geometric figures.
I do not wish, however, to speak here in more detail of these two
directions of research in geometric axiomatics, for which more authentic
representatives are present here, nor of the successes achieved with the
use of topological methods, of which recent papers by Freudenthal in
particular give a survey; rather, I wish to turn to the questions of the
direction of axiomatization named in third place.
Even within this direction we again find a multiplicity of possible aims.
On the one hand, one can aim at making do with as few basic elements as
possible, say with only one basic predicate and one sort of individuals.
On the other hand, one can be chiefly concerned to let natural
separations of parts of the axiomatics stand out. These viewpoints lead
to various alternatives.
Thus, on the one hand, the consideration of non-Euclidean geometry
suggests putting "absolute" geometry first. On the other hand, there is
also much to be said for a construction in which affine vector geometry
is put first, as is done at the beginning of Weyl's "Raum, Zeit,
Materie". One can hardly satisfy both of these viewpoints at once in a
single axiomatics. Another example is this. When the axioms of incidence
and order are put first, it is a possible and elegant conceptual
reduction to reduce the concept of collinearity to the betweenness
concept, following the precedent of Veblen. On the other hand, it is
important for some considerations to separate off those consequences of
the incidence axioms that are independent of the concept of order; thus
it is desirable to recognize the founding of the segment calculus on the
incidence axioms as independent of the order axioms. Again, in the theory
of order itself, savings of axioms of linear order have been found
possible by applying the axiom of Pasch; on the other hand, an
arrangement of the axioms is in a certain respect preferable in which the
axioms characteristic of linear order are set apart.
With these examples of alternatives, the multiplicity of possible aims,
and of those actually pursued, is not nearly exhausted. Thus it is a
possible and sensible, though not obligatory, regulative viewpoint that
the axioms should be formulated so that each refers only to a bounded
region of space. This idea is probably already implicitly a
co-determining factor in Euclidean axiomatics; and it may also be that
the objection taken so early to the parallel axiom rests precisely on the
fact that the concept of sufficiently far prolongation occurs in the
Euclidean formulation. The first explicit execution of this point of the
program was due to Moritz Pasch, and connected with it was the
introduction of ideal elements by means of intersection theorems, a
method of founding projective geometry that has since been successfully
elaborated.
Another kind of possible additional task is to imitate conceptually the
unsharpness of our pictorial imagination, as Hjelmslev has done. This, to
be sure, yields not only a different kind of axiomatization but
altogether a deviating system of relations, a procedure which, probably
on account of its complication, has not met with much approval. Yet even
without departing so far from the usual in this direction, one can aim at
something similar in a certain respect by avoiding the concept of point
as a generic concept, as is done in various interesting recent
axiomatizations, in particular in that of Huntington.
In such a way it becomes apparent in the most varied manner that there is
no unique optimum for the design of a geometric axiom system. As regards,
incidentally, the reductions with respect to the basic concepts and the
sorts of things, it should always be remembered, notwithstanding the
fundamental interest that every such possibility of reduction has, that
the actual application of such a reduction is advisable only if a
perspicuous design of the axiom system is achieved thereby.
Certain directives for reductions can nevertheless be named that we can
accept generally. Let us take as an example Hilbert's version of the
axiomatics. In it, on the one hand, the lines are taken as a sort of
things; on the other hand, the rays are introduced as sets of points, and
subsequently the angles are defined as ordered pairs of two rays issuing
from one point, that is, as pairs of sets. Here possibilities of
simplifying reduction are indeed given. One may be of different opinions
as to whether, instead of the different sorts "point, line, plane", one
wishes to take only one sort, that of points, as basic, in which case the
relations of collinearity and coplanarity of points take the place of the
incidence relation. In the lattice-theoretic treatment, after all, the
lines and planes are taken as things on an equal footing with the points.
Here one again faces an alternative. Introducing the rays as sets of
points, by contrast, in any case exceeds the framework of elementary
geometry and is also not necessary for it. In general we may well take it
as a directive that higher sorts should not be introduced without need.
In the case of the definition of angle this can be avoided by reducing
statements about angles to statements about triples of points, as was
carried out by R. L. Moore. Here an even further reduction is achieved,
in that angle congruence is defined altogether by means of segment
congruence; yet here again a certain loss also occurs. Namely, the proofs
rest essentially on the congruence of triangles matched with opposite
orientation. Therefore this kind of axiomatization is not suited to the
range of problems of those investigations of Hilbert that concern the
relation of orientation-preserving congruence to symmetry. This remark
admittedly also applies to most of the axiomatizations in which the
concept of reflections stands at the head.
Besides the general viewpoints, I should like to mention, as something
particular, a special possibility for the design of an elementary axiom
system, namely an axiomatics in which the concept "the triple of points
a, b, c forms a right angle at b" is taken as the only basic relation and
the points as the only basic sort, a program to which attention has
recently been drawn by a paper of Dana Scott. The relation named
satisfies the necessary condition established by Tarski for a basic
predicate that suffices by itself for plane geometry. In comparison with
the procedure of Pieri, which has become the model for an axiomatics of
this kind and which took the relation "b and c have the same distance
from a" as basic concept in an axiomatization, there seems to be a
simplification here insofar as the concept of collinearity of points
attaches more closely to that of the right angle than to Pieri's basic
concept. As far as the concept of congruence is concerned, however, no
simplification of the congruence axioms seems to result from the
reduction considered. Incidentally, this axiomatization, like that of
Pieri just mentioned, is one of those that do not yield a separation of
orientation-preserving congruence.¹
For an elementary axiomatization of geometry a particular question
arises, that of attaining completeness in the sense of categoricity. In
most axiom systems this is brought about by the continuity axioms. The
introduction of these axioms, however, means, as is well known, going
beyond the framework of ordinary predicate logic, since the Archimedean
axiom uses the general concept of number and the second continuity axiom
uses the general concept of predicate or set. We have since learned from
Tarski's investigations that completeness, at least in the deductive
sense, can be attained within an elementary framework; the remarkable
point is that the cut axiom is retained in a certain formalization,
whereas the Archimedean axiom is dropped. The Archimedean axiom falls
formally outside the remaining framework insofar as, in a logical
formalization, it has the form of an infinite disjunction, whereas the
cut axiom, owing to its universal form, can be represented by an axiom
schema and can thereby be adapted in its application to the formal
framework at hand; for the elementary framework of predicate logic the
provability of the Archimedean axiom from the cut axiom is then lost. To
be sure, such a restriction to a framework of predicate logic has the
consequence that various considerations can be carried out only
metatheoretically, for example the proof of the theorem that a simple
closed polygon divides the plane, and likewise the treatment of equality
of polygons by completion and by decomposition. Here one once again
faces an alternative, namely whether to give priority to the
elementarity of the logical framework or to impose no restriction on the
logical framework, where, incidentally, various gradations come into
consideration.
1) Some indications concerning the definitions of the notions of
incidence, order, and congruence from the notion of the right angle, as
well as concerning part of the axiom system, follow in an appendix.
As regards the use of a second-order logic, let it only be recalled here
that such a logic can be made precise within the framework of axiomatic
set theory in such a way that no perceptible restriction of the methods
of proof results. Nor does Skolem's paradox cause any real embarrassment
in the case of geometry, since it can be eliminated by identifying, in
the model-theoretic considerations, the concept of set that occurs in
one of the higher axioms with the concept of set of model theory.
In conclusion I should like to emphasize that the circumstance stressed
in my remarks, that there is no unique optimum in the shaping of an
axiomatization, by no means implies that the products of geometrical
axiomatics necessarily bear the character of the imperfect and the
fragmentary. You know that in this field a number of formulations of
great perfection and finish have been achieved. It is precisely the
multiplicity of possible aims that has the effect that the earlier work
is in general not simply superseded by the newer, while on the other
hand every perfection attained still leaves room for further tasks.
APPENDIX. Remarks on the task of axiomatizing Euclidean plane geometry
with the single primitive relation R(a, b, c): "the point triple a, b, c
forms a right angle at b". The axiomatization succeeds in a simple way
as long as only the relations of collinearity and of parallelism are
considered. For the theory of collinearity the following axioms suffice:
A1 $\neg R(a, b, a)$
A2 $R(a, b, c) \to R(c, b, a) \mathbin{\&} \neg R(a, c, b)$ 2)
A3 $R(a, b, c) \mathbin{\&} R(a, b, d) \mathbin{\&} R(e, b, c) \to R(e, b, d)$
A4 $R(a, b, c) \mathbin{\&} R(a, b, d) \mathbin{\&} c \neq d \mathbin{\&} R(e, c, b) \to R(e, c, d)$
A5 $a \neq b \to (Ex)\,R(a, b, x)$.
To these is added the definition of the relation Koll(a, b, c): "the
points a, b, c are collinear":
DEFINITION 1. $\mathrm{Koll}(a, b, c) \leftrightarrow (x)(R(x, a, b) \to R(x, a, c)) \lor a = c$.
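The axioms A1-A5 and Definition 1 can be tried out on a concrete model. The following is a minimal sketch, not part of the original text, which assumes the integer coordinate plane as model and reads R(a, b, c) as "a and c differ from b and the vectors a - b and c - b have scalar product zero". The names `R`, `koll`, `det_collinear`, `GRID`, and `WITNESSES` are illustrative choices. The quantifiers are restricted to small finite grids, so the check is a sanity test on sample points, not a proof.

```python
from itertools import product

def R(a, b, c):
    """Right angle at b in the integer plane: a != b, c != b, (a-b).(c-b) = 0."""
    if a == b or c == b:
        return False
    return (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1]) == 0

def koll(a, b, c, witnesses):
    """Definition 1, with the universal quantifier (x) restricted to `witnesses`."""
    return a == c or all(R(x, a, c) for x in witnesses if R(x, a, b))

def det_collinear(a, b, c):
    """Ordinary coordinate test for collinearity (vanishing determinant)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0

def implies(p, q):
    return (not p) or q

# Assumed test data: a 3x3 grid of points, and a larger grid of witnesses
# that contains a perpendicular witness for every pair of grid points.
GRID = list(product(range(3), repeat=2))
WITNESSES = list(product(range(-2, 5), repeat=2))

def check_axioms(pts=GRID, witnesses=WITNESSES):
    """Brute-force check of A1-A5 over all tuples of points from `pts`."""
    return {
        "A1": all(not R(a, b, a) for a, b in product(pts, repeat=2)),
        "A2": all(implies(R(a, b, c), R(c, b, a) and not R(a, c, b))
                  for a, b, c in product(pts, repeat=3)),
        "A3": all(implies(R(a, b, c) and R(a, b, d) and R(e, b, c), R(e, b, d))
                  for a, b, c, d, e in product(pts, repeat=5)),
        "A4": all(implies(R(a, b, c) and R(a, b, d) and c != d and R(e, c, b),
                          R(e, c, d))
                  for a, b, c, d, e in product(pts, repeat=5)),
        "A5": all(implies(a != b, any(R(a, b, x) for x in witnesses))
                  for a, b in product(pts, repeat=2)),
    }

def check_definition_1(pts=GRID, witnesses=WITNESSES):
    """Definition 1 agrees with the determinant test on the sample grid."""
    return all(koll(a, b, c, witnesses) == det_collinear(a, b, c)
               for a, b, c in product(pts, repeat=3))
```

Calling `check_axioms()` returns a dictionary with one truth value per axiom, and `check_definition_1()` compares the defined collinearity with the usual coordinate criterion. The witness grid is chosen larger than the point grid because the perpendicular witnesses required by Definition 1 and by A5 may lie outside the 3x3 grid.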
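The following theorems are then provable:
(1) $\mathrm{Koll}(a, b, c) \leftrightarrow a = b \lor a = c \lor b = c \lor (Ex)(R(x, a, b) \mathbin{\&} R(x, a, c))$
(2) $\mathrm{Koll}(a, b, c) \to \mathrm{Koll}(a, c, b) \mathbin{\&} \mathrm{Koll}(b, a, c)$
(3) $\mathrm{Koll}(a, b, c) \mathbin{\&} \mathrm{Koll}(a, b, d) \mathbin{\&} a \neq b \to \mathrm{Koll}(b, c, d)$
(4) $R(a, b, c) \mathbin{\&} \mathrm{Koll}(b, c, d) \mathbin{\&} b \neq d \to R(a, b, d)$
(5) $R(a, b, c) \to \neg\mathrm{Koll}(a, b, c)$
(6) $R(a, b, c) \mathbin{\&} R(a, b, d) \to \mathrm{Koll}(b, c, d)$
(7) $R(a, b, c) \mathbin{\&} R(a, b, d) \to \neg R(a, c, d)$.
For the proof: $\mathrm{Koll}(c, d, b) \mathbin{\&} c \neq b \to (R(a, c, d) \to R(a, c, b))$
(8) $R(a, b, c) \mathbin{\&} R(a, b, d) \mathbin{\&} R(a, e, c) \mathbin{\&} R(a, e, d) \to c = d \lor b = e$.
For the proof: $\mathrm{Koll}(b, c, d) \mathbin{\&} \mathrm{Koll}(e, c, d) \mathbin{\&} c \neq d \to \mathrm{Koll}(b, c, e)$
$\mathrm{Koll}(\ldots, b) \mathbin{\&} b \neq e \mathbin{\&} R(a, b, c) \to R(a, b, e)$
$\mathrm{Koll}(\ldots, b) \mathbin{\&} b \neq e \mathbin{\&} R(a, e, c) \to R(a, e, b)$
$R(a, b, e) \to \neg R(a, e, b)$.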
For the theory of parallelism we add two further axioms:
A6 $a \neq b \mathbin{\&} a \neq c \to (Ex)(R(x, a, b) \mathbin{\&} R(x, a, c)) \lor (Ex)(R(a, x, b) \mathbin{\&} R(a, x, c)) \lor R(a, b, c) \lor R(a, c, b)$
In customary terms, axiom A6 says that from a point a outside a line bc
a perpendicular can be dropped onto that line. That the perpendicular is
uniquely determined by the point a and the line bc follows with the help
of (4) and (8).
A7 $R(a, b, c) \mathbin{\&} R(b, c, d) \mathbin{\&} R(c, d, a) \to R(d, a, b)$
This is a form of the Euclidean parallel axiom in the narrower,
angle-metric sense.
2) This axiom (A2) already excludes elliptic geometry.
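Parallelism is now defined by:
DEFINITION 2. $\mathrm{Par}(a, b; c, d) \leftrightarrow a \neq b \mathbin{\&} c \neq d \mathbin{\&} (Ex)(Ey)(R(a, x, y) \mathbin{\&} R(b, x, y) \mathbin{\&} R(c, y, x) \mathbin{\&} R(d, y, x))$
The following theorems result as provable:
(9) $\mathrm{Par}(a, b; c, d) \to \mathrm{Par}(b, a; c, d) \mathbin{\&} \mathrm{Par}(c, d; a, b)$
(10) $\mathrm{Par}(a, b; c, d) \to a \neq c \mathbin{\&} a \neq d \mathbin{\&} b \neq c \mathbin{\&} b \neq d$
(11) $\mathrm{Par}(a, b; c, d) \leftrightarrow a \neq b \mathbin{\&} c \neq d \mathbin{\&} (Ex)(Eu)((R(a, x, u) \lor x = a) \mathbin{\&} (R(b, x, u) \lor x = b) \mathbin{\&} (R(x, u, c) \lor u = c) \mathbin{\&} (R(x, u, d) \lor u = d))$
For the proof of the implication from right to left one has to show that
on a line a, b there lie at least five distinct points, which can be
done with the help of the axioms A1-A6.
(12) $\mathrm{Par}(a, b; c, d) \to (x)((R(a, x, c) \lor x = a) \mathbin{\&} (R(b, x, c) \lor \ldots$
(13) $\mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Koll}(a, b, e) \mathbin{\&} b \neq e \to \mathrm{Par}(b, e; c, d)$
and from this in particular
(14) $\mathrm{Par}(a, b; c, d) \to \neg\mathrm{Koll}(a, b, c)$;
furthermore
(15) $\mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Koll}(a, b, e) \to \neg\mathrm{Koll}(c, d, e)$
(16) $\neg\mathrm{Koll}(a, b, c) \to (Ex)\,\mathrm{Par}(a, b; c, x)$
(17) $\mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Par}(a, b; c, e) \to \mathrm{Koll}(c, d, e)$
(18) $\mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Par}(a, b; e, f) \to \mathrm{Par}(c, d; e, f) \lor (\mathrm{Koll}(e, c, d) \mathbin{\&} \ldots$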
To the notion of parallelism there is further attached that of vector
equality: "a, b and c, d are the opposite sides of a parallelogram":
DEFINITION 3. $\mathrm{Pag}(a, b; c, d) \leftrightarrow \mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Par}(a, c; b, d)$
With this one can prove:
(19) $\mathrm{Pag}(a, b; c, d) \to \mathrm{Pag}(c, d; a, b) \mathbin{\&} \mathrm{Pag}(a, c; b, d)$
(20) $\mathrm{Pag}(a, b; c, d) \mathbin{\&} \mathrm{Pag}(a, b; c, e) \to d = e$
(21) $\mathrm{Pag}(a, b; c, d) \to \neg\mathrm{Koll}(a, b, c)$.
For the proof of the existence theorem
(22) $\neg\mathrm{Koll}(a, b, c) \to (Ex)\,\mathrm{Pag}(a, b; c, x)$
a further axiom is needed:
A8 $R(a, b, c) \to (Ex)(R(a, c, x) \mathbin{\&} R(c, b, x))$.
With the help of this axiom it is provable in general that two distinct,
non-parallel lines have a point of intersection:
(23) $\neg\mathrm{Koll}(a, b, c) \mathbin{\&} \neg\mathrm{Par}(a, b; c, d) \to (Ex)(\mathrm{Koll}(a, b, x) \mathbin{\&} \mathrm{Koll}(c, d, x))$.
Whether on the whole a perspicuous axiomatization with the primitive
notion R can be attained may be left open. We content ourselves here
with setting up definitions for the essential further notions. For
these, at any rate, a certain perspicuity can be achieved.
To the figure of the parallelogram are attached the following two
different definitions of the relation "a is the midpoint of the segment
b, c":
DEFINITION $4_1$. $\mathrm{Mp}_1(a; b, c) \leftrightarrow (Ex)(Ey)(\mathrm{Pag}(b, x; y, c) \mathbin{\&} \mathrm{Koll}(a, b, c) \mathbin{\&} \mathrm{Koll}(a, x, y))$
DEFINITION $4_2$. $\mathrm{Mp}_2(a; b, c) \leftrightarrow (Ex)(Ey)(\mathrm{Pag}(x, y; a, b) \mathbin{\&} \mathrm{Pag}(x, y; c, a))$.
In the sense of the second definition one can prove the possibility of
doubling a segment:
(24) $a \neq b \to (Eu)\,\mathrm{Mp}_2(a; b, u)$.
The existence of the midpoint of a segment in the sense of Definition
$4_1$, i.e.
(25) $b \neq c \to (Eu)\,\mathrm{Mp}_1(u; b, c)$,
can be proved if one adds the further axiom:
A9 $\mathrm{Par}(a, b; c, d) \mathbin{\&} \mathrm{Par}(a, c; b, d) \to \neg\mathrm{Par}(a, d; b, c)$.
(In a parallelogram the diagonals intersect.)
By specializing the figure belonging to the definition of $\mathrm{Mp}_1$
we obtain a definition of the relation "a, b, c form an isosceles
triangle with apex at a":
DEFINITION $5_1$. $\mathrm{Ist}_1(a; b, c) \leftrightarrow (Eu)(Ev)(\mathrm{Pag}(a, b; c, v) \mathbin{\&} R(a, u, b) \mathbin{\&} R(a, u, c) \mathbin{\&} R(b, u, v))$.
With the help of $\mathrm{Mp}_1$ and $\mathrm{Ist}_1$ we can define
Pieri's primitive notion "a has the same distance from b and c":
DEFINITION 6. $\mathrm{Is}_1(a; b, c) \leftrightarrow b = c \lor \mathrm{Mp}_1(a; b, c) \lor \mathrm{Ist}_1(a; b, c)$.
Another way of defining the notion Is rests on the use of symmetry. For
this the following auxiliary notion serves: "a, b, c, d, e form a
'normal' quintuple":
DEFINITION 7. $\mathrm{Qn}(a, b, c, d, e) \leftrightarrow R(a, c, b) \mathbin{\&} R(a, d, b) \mathbin{\&} R(a, e, c) \mathbin{\&} R(a, e, d) \mathbin{\&} R(b, e, c) \mathbin{\&} c \neq d$.
With the help of Qn we obtain a further way of defining Mp and Ist:
DEFINITION $4_3$. $\mathrm{Mp}_3(a; b, c) \leftrightarrow (Ex)(Ey)\,\mathrm{Qn}(x, y, b, c, a)$
DEFINITION $5_2$. $\mathrm{Ist}_2(a; b, c) \leftrightarrow (Ex)(Ey)\,\mathrm{Qn}(a, x, b, c, y)$,
from which $\mathrm{Is}_2$ can be defined in the same way as $\mathrm{Is}_1$.
Connected with this is, further, the definition of the mirror symmetry
of points a, b with respect to a line c d:
DEFINITION 8. $\mathrm{Sym}(a, b; c, d) \leftrightarrow c \neq d \mathbin{\&} (Ex)(Ey)(Ez)(\mathrm{Koll}(x, c, d) \mathbin{\&} \mathrm{Koll}(y, c, d) \mathbin{\&} \mathrm{Qn}(x, y, a, b, z))$.
For the definition of congruence of segments we finally still need the
notion of orientation-preserving congruence on a line: "the segments a b
and c d are collinear, congruent, and equally directed":
DEFINITION $9_1$. $\mathrm{Lg}_1(a, b; c, d) \leftrightarrow \mathrm{Koll}(a, b, c) \mathbin{\&} (Ex)(Ey)(\mathrm{Pag}(a, x; b, y) \mathbin{\&} \mathrm{Pag}(c, x; d, y))$,
or alternatively:
DEFINITION $9_2$. $\mathrm{Lg}_2(a, b; c, d) \leftrightarrow \mathrm{Koll}(a, b, c) \mathbin{\&} a \neq b \mathbin{\&} (Ex)(\mathrm{Mp}(x; b, c) \mathbin{\&} \mathrm{Mp}(x; a, d)) \lor (a = d \mathbin{\&} \mathrm{Mp}(a; b, c)) \lor (b = c \mathbin{\&} \mathrm{Mp}(b; a, d))$,
(where for Mp any one of the three definitions above can be taken).
Congruence of segments can now be defined as a whole (with either of the
two definitions of Lg):
DEFINITION 10. $\mathrm{Kg}(a, b; c, d) \leftrightarrow \mathrm{Lg}(a, b; c, d) \lor \mathrm{Lg}(a, b; d, c) \lor (a = c \mathbin{\&} \mathrm{Is}_1(a; b, d)) \lor (Ex)(\mathrm{Pag}(a, b; c, x) \mathbin{\&} \mathrm{Is}_1(c; x, d))$.
By a definition analogous to that of $\mathrm{Lg}_2$ one can also
introduce the congruence of angles with the same vertex as a six-place
relation, after first introducing the notion of the angle bisector:
"d ($\neq a$) lies on the bisector of the angle b a c":
DEFINITION 11. $\mathrm{Wh}(a, d; b, c) \leftrightarrow \neg\mathrm{Koll}(a, b, c) \mathbin{\&} (Ex)(Ey)(Ez)(\mathrm{Koll}(a, c, x) \mathbin{\&} \mathrm{Koll}(a, d, y) \mathbin{\&} \mathrm{Qn}(a, y, b, x, z))$.
In view of the highly composite character of this congruence relation
Kg, one will, in the axiomatization, reduce the laws about Kg to laws
about the notions that occur as constituents of the defining expression.
Owing to the plurality of definitions of Mp, Ist, Is, there are
alternatives here with respect to whether one draws more heavily on the
relations of parallelism or on those of symmetry. In any case the axiom
of vector geometry
A10 $\mathrm{Pag}(a, b; p, q) \mathbin{\&} \mathrm{Pag}(b, c; q, r) \to \mathrm{Pag}(a, c; p, r) \lor (\mathrm{Koll}(a, c, p) \mathbin{\&} \mathrm{Koll}(a, c, r))$
or an equivalent one should be expedient. On the whole one could set
oneself the aim here of presenting, in as symmetric a way as possible,
the interplay of parallelism and reflection that is present in Euclidean
plane geometry.
As regards, finally, the betweenness relation, the figure for the
definition of the relation "a lies between b and c" is already contained
as a constituent in that of Qn. For we can define:
DEFINITION 12. $\mathrm{Zw}(a; b, c) \leftrightarrow (Ex)(R(b, a, x) \mathbin{\&} R(c, a, x) \mathbin{\&} R(b, x, c))$.
For this notion the following are provable to begin with:
(26) $\neg\mathrm{Zw}(a; b, b)$
(27) $\mathrm{Zw}(a; b, c) \to \mathrm{Zw}(a; c, b)$
(28) $\mathrm{Zw}(a; b, c) \to \mathrm{Koll}(a, b, c)$
and furthermore, with the use of A5, A6 and A8,
(29) $a \neq b \to (Ex)\,\mathrm{Zw}(x; a, b) \mathbin{\&} (Ex)\,\mathrm{Zw}(b; a, x)$.
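Definition 12 can be illustrated on sample points. The sketch below is not part of the original text; it assumes the integer coordinate plane with R read as vanishing scalar product, and it searches for the witness x of the definition in a finite grid. The names `R`, `zw`, and `WITNESSES` are illustrative. Since the witness lies at the height given by the geometric mean of the two partial segments, it need not have integer coordinates, so a negative answer from the search is conclusive only when no witness can exist at all.

```python
from itertools import product

def R(a, b, c):
    """Right angle at b in the integer plane: a != b, c != b, (a-b).(c-b) = 0."""
    if a == b or c == b:
        return False
    return (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1]) == 0

# Assumed finite search space for the existential quantifier (Ex).
WITNESSES = list(product(range(-5, 6), repeat=2))

def zw(a, b, c, witnesses=WITNESSES):
    """Definition 12: a lies between b and c if some x forms right angles
    b-a-x and c-a-x at a, and a right angle b-x-c at x."""
    return any(R(b, a, x) and R(c, a, x) and R(b, x, c) for x in witnesses)
```

For instance, `zw((0, 0), (-1, 0), (4, 0))` finds the witness (0, 2), since the altitude over the point (0, 0) has length 2, the geometric mean of 1 and 4. By contrast, `zw((-1, 0), (0, 0), (1, 0))` is false for every conceivable witness, in accordance with the fact that (-1, 0) does not lie between the other two points.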
To obtain the further properties of the betweenness notion the following
axioms can serve:
A11 $R(a, b, c) \mathbin{\&} R(a, b, d) \mathbin{\&} R(c, a, d) \mathbin{\&} R(e, c, b) \to \neg R(b, e, d)$
A12 $R(a, b, d) \mathbin{\&} R(d, b, c) \mathbin{\&} a \neq c \to \mathrm{Zw}(a; b, c) \lor \mathrm{Zw}(b; a, c) \lor \mathrm{Zw}(c; a, b)$
A13 $\mathrm{Zw}(a; b, c) \mathbin{\&} \mathrm{Zw}(b; a, d) \to \mathrm{Zw}(a; c, d)$
A14 $R(a, b, d) \mathbin{\&} R(d, b, c) \mathbin{\&} R(a, c, e) \mathbin{\&} \mathrm{Zw}(d; a, e) \to \mathrm{Zw}(b; a, c)$
From this axiom one can obtain in a few steps the more general theorem:
(30) $\mathrm{Zw}(b; a, c) \mathbin{\&} \mathrm{Koll}(a, d, e) \mathbin{\&} \mathrm{Par}(b, d; c, e) \to \ldots$
This is achieved by using the theorem
(31) $R(a, b, e) \mathbin{\&} R(e, b, c) \mathbin{\&} R(b, a, d) \mathbin{\&} R(b, c, f) \mathbin{\&} R(b, e, d) \mathbin{\&} R(b, e, f) \mathbin{\&} \mathrm{Zw}(b; a, c) \to \mathrm{Zw}(e; d, f)$,
which can be derived from the previously mentioned axiom A10.
With the help of (30) and the axiom A13 one can prove:
(32) $\neg\mathrm{Koll}(a, b, c) \mathbin{\&} \mathrm{Zw}(b; a, d) \mathbin{\&} \mathrm{Zw}(e; b, c) \to (Ex)(\mathrm{Koll}(e, d, x) \mathbin{\&} \mathrm{Zw}(x; a, c))$,
i.e. the axiom of Pasch in the narrower version due to Veblen.
In this connection let us also mention the following definition of Kg by
means of the notions Is and Zw, which rests on a construction of Euclid:
DEFINITION 13. $\mathrm{Kg}^{*}(a, b; c, d) \leftrightarrow (Ex)(Ey)(Ez)(\mathrm{Is}(x; a, c) \mathbin{\&} \mathrm{Zw}(y; a, x) \mathbin{\&} \mathrm{Zw}(z; c, x) \mathbin{\&} \mathrm{Is}(a; b, y) \mathbin{\&} \mathrm{Is}(c; d, z) \mathbin{\&} \mathrm{Is}(x; y, z))$.
(For Is either $\mathrm{Is}_1$ or $\mathrm{Is}_2$ can be taken here at will.)
Of an axiomatization such as the one described here, in which
collinearity and betweenness are coupled with orthogonality, one cannot,
to be sure, demand that it yield a separation of the axioms of the
linear. Moreover, the layout here is restricted from the outset to plane
geometry, since the definition of collinearity is no longer applicable
in several dimensions. The restriction to Euclidean geometry, too, is
introduced at an early stage. On the other hand, this axiomatization may
be particularly suited to bringing out the great simplicity and elegance
of the laws of Euclidean plane geometry.
Symposium on the Axiomatic Method
WHAT IS ELEMENTARY GEOMETRY?
ALFRED TARSKI
Institute for Basic Research in Science,
University of California, Berkeley, California, U.S.A.
In colloquial language the term elementary geometry is used loosely to
refer to the body of notions and theorems which, following the tradition
of Euclid's Elements, form the subject matter of geometry courses in
secondary schools. Thus the term has no well determined meaning and
can be subjected to various interpretations. If we wish to make elementa-
ry geometry a topic of metamathematical investigation and to obtain
exact results (not within, but) about this discipline, then a choice of a
definite interpretation becomes necessary. In fact, we have then to
describe precisely which sentences can be formulated in elementary
geometry and which among them can be recognized as valid; in other
words, we have to determine the means of expression and proof with
which the discipline is provided.
In this paper we shall primarily concern ourselves with a conception of
elementary geometry which can roughly be described as follows: we
regard as elementary that part of Euclidean geometry which can be formulated
and established without the help of any set-theoretical devices. 1
More precisely, elementary geometry is conceived here as a theory with
standard formalization in the sense of [9]. 2 It is formalized within
elementary logic, i.e., first-order predicate calculus. All the variables
$x, y, z, \ldots$ occurring in this theory are assumed to range over
elements of a fixed set; the elements are referred to as points, and the
set as the space. The logical constants of the theory are (i) the
sentential connectives: the negation symbol $\neg$, the implication
symbol $\to$, the disjunction symbol $\lor$, and the conjunction symbol
$\land$; (ii) the quantifiers: the universal quantifier $\bigwedge$ and
the existential quantifier $\bigvee$; and (iii) two special binary
predicates: the identity symbol $=$ and the diversity symbol $\neq$. As
non-logical constants (primitive symbols of the theory) we could choose
any predicates denoting certain relations among points in terms of which
all geometrical notions are known to be definable. Actually we pick two
predicates for this purpose: the ternary predicate $\beta$ used to denote
the betweenness relation and the quaternary predicate $\delta$ used to
denote the equidistance relation; the formula $\beta(xyz)$ is read y lies
between x and z (the case when y coincides with x or z not being
excluded), while $\delta(xyzu)$ is read x is as distant from y as z is
from u.
1 The paper was prepared for publication while the author was working on a
research project in the foundations of mathematics sponsored by the U.S.
National Science Foundation.
2 One of the main purposes of this paper is to exhibit the significance of
notions and methods of modern logic and metamathematics for the study of
the foundations of geometry. For logical and metamathematical notions
involved in the discussion consult [8] and [9] (see the bibliography at
the end of the paper). The main metamathematical result upon which the
discussion is based was established in [7]. For algebraic notions and
results consult [11].
Several articles in this volume are related to the present paper in
methods and results. This applies in the first place to Scott [5] and
Szmielew [6], and to some extent also to Robinson [3].
Thus, in our formalization of elementary geometry, only points are
treated as individuals and are represented by (first-order) variables.
Since elementary geometry has no set-theoretical basis, its formalization
does not provide for variables of higher orders and no symbols are
available to represent or denote geometrical figures (point sets), classes
of geometrical figures, etc. It should be clear that, nevertheless, we are
able to express in our symbolism all the results which can be found in
textbooks of elementary geometry and which are formulated there in
terms referring to various special classes of geometrical figures, such as
the straight lines, the circles, the segments, the triangles, the quadrangles,
and, more generally, the polygons with a fixed number of vertices, as
well as to certain relations between geometrical figures in these classes,
such as congruence and similarity. This is primarily a consequence of the
fact that, in each of the classes just mentioned, every geometrical figure
is determined by a fixed finite number of points. For instance, instead of
saying that a point z lies on the straight line through the points x and y,
we can state that either $\beta(xyz)$ or $\beta(yzx)$ or $\beta(zxy)$
holds; instead of saying that two segments with the end-points $x, y$ and
$x', y'$ are congruent, we simply state that $\delta(xyx'y')$. 3
3 In various formalizations of geometry (whether elementary or not) which
are known from the literature, and in particular in all those which follow
the lines of [1], not only points but also certain special geometrical
figures are treated as individuals and are represented by first-order
variables; usually the only figures treated this way are straight lines,
planes, and, more generally, linear subspaces. The set-theoretical
relations of membership and inclusion, between a point and a special
geometrical figure or between two such figures, are replaced by the
geometrical relation of incidence, and the symbol denoting this relation
is included in the list of primitive symbols of geometry. All other
geometrical figures are treated as point sets and can be represented by
second-order variables (assuming that the system of geometry discussed is
provided with a set-theoretical basis). This approach has some advantages
for restricted purposes of projective geometry; in fact, it facilitates
the development of projective geometry by yielding a convenient
formulation of the duality principle, and leads to a subsumption of this
geometry under the algebraic theory of lattices. In other branches of
geometry an analogous procedure can hardly be justified; the non-uniform
treatment of geometrical figures seems to be intrinsically unnatural,
obscures the logical structure of the foundations of geometry, and leads
to some complications in the development of this discipline (by
necessitating, e.g., a distinction between a straight line and the set of
all points on this line).
A sentence formulated in our symbolism is regarded as valid if it follows
(semantically) from sentences adopted as axioms, i.e., if it holds in every
mathematical structure in which all the axioms hold. In the present case,
by virtue of the completeness theorem for elementary logic, this amounts
to saying that a sentence is valid if it is derivable from the axioms by
means of some familiar rules of inference. To obtain an appropriate set
of axioms, we start with an axiom system which is known to provide an
adequate basis for the whole of Euclidean geometry and contains $\beta$
and $\delta$ as the only non-logical constants. Usually the only
non-elementary sentence in such a system is the continuity axiom, which
contains second-order variables $X, Y, \ldots$ ranging over arbitrary
point sets (in addition to first-order variables $x, y, \ldots$ ranging
over points) and also an additional logical constant, the membership
symbol $\in$ denoting the membership relation between points and point
sets. The continuity axiom can be formulated, e.g., as follows:
$\bigwedge XY \{\bigvee z \bigwedge xy\,[x \in X \land y \in Y \to \beta(zxy)] \to \bigvee u \bigwedge xy\,[x \in X \land y \in Y \to \beta(xuy)]\}$.
We remove this axiom from the system and replace it by the infinite
collection of all elementary continuity axioms, i.e., roughly, by all the
sentences which are obtained from the non-elementary axiom if $x \in X$ is
replaced by an arbitrary elementary formula in which x occurs free, and
$y \in Y$ by an arbitrary elementary formula in which y occurs free. To
fix the ideas, we restrict ourselves in what follows to the
two-dimensional elementary geometry and quote explicitly a simple axiom
system obtained in the way just described. The system consists of twelve
individual axioms, A1-A12, and the infinite collection of all elementary
continuity axioms, A13.
A1 [IDENTITY AXIOM FOR BETWEENNESS].
$\bigwedge xy\,[\beta(xyx) \to (x = y)]$
A2 [TRANSITIVITY AXIOM FOR BETWEENNESS].
$\bigwedge xyzu\,[\beta(xyu) \land \beta(yzu) \to \beta(xyz)]$
A3 [CONNECTIVITY AXIOM FOR BETWEENNESS].
$\bigwedge xyzu\,[\beta(xyz) \land \beta(xyu) \land (x \neq y) \to \beta(xzu) \lor \beta(xuz)]$
A4 [REFLEXIVITY AXIOM FOR EQUIDISTANCE].
$\bigwedge xy\,[\delta(xyyx)]$
A5 [IDENTITY AXIOM FOR EQUIDISTANCE].
$\bigwedge xyz\,[\delta(xyzz) \to (x = y)]$
A6 [TRANSITIVITY AXIOM FOR EQUIDISTANCE].
$\bigwedge xyzuvw\,[\delta(xyzu) \land \delta(xyvw) \to \delta(zuvw)]$
A7 [PASCH'S AXIOM].
$\bigwedge txyzu \bigvee v\,[\beta(xtu) \land \beta(yuz) \to \beta(xvy) \land \beta(ztv)]$
A8 [EUCLID'S AXIOM].
$\bigwedge txyzu \bigvee vw\,[\beta(xut) \land \beta(yuz) \land (x \neq u) \to \beta(xzv) \land \beta(xyw) \land \beta(vtw)]$
A9 [FIVE-SEGMENT AXIOM].
$\bigwedge xx'yy'zz'uu'\,[\delta(xyx'y') \land \delta(yzy'z') \land \delta(xux'u') \land \delta(yuy'u') \land \beta(xyz) \land \beta(x'y'z') \land (x \neq y) \to \delta(zuz'u')]$
A10 [AXIOM OF SEGMENT CONSTRUCTION].
$\bigwedge xyuv \bigvee z\,[\beta(xyz) \land \delta(yzuv)]$
A11 [LOWER DIMENSION AXIOM].
$\bigvee xyz\,[\neg\beta(xyz) \land \neg\beta(yzx) \land \neg\beta(zxy)]$
A12 [UPPER DIMENSION AXIOM].
$\bigwedge xyzuv\,[\delta(xuxv) \land \delta(yuyv) \land \delta(zuzv) \land (u \neq v) \to \beta(xyz) \lor \beta(yzx) \lor \beta(zxy)]$
A13 [ELEMENTARY CONTINUITY AXIOMS]. All sentences of the form
$\bigwedge vw \ldots \{\bigvee z \bigwedge xy\,[\varphi \land \psi \to \beta(zxy)] \to \bigvee u \bigwedge xy\,[\varphi \land \psi \to \beta(xuy)]\}$
where $\varphi$ stands for any formula in which the variables
$x, v, w, \ldots$, but neither y nor z nor u, occur free, and similarly
for $\psi$, with x and y interchanged.
Elementary geometry based upon the axioms just listed will be denoted
by $\mathcal{E}_2$. In Theorems 1-4 below we state fundamental
metamathematical properties of this theory. 4
First we deal with the representation problem for $\mathcal{E}_2$, i.e.,
with the problem of characterizing all models of this theory. By a model
of $\mathcal{E}_2$ we understand a system
$\mathfrak{M} = \langle A, B, D \rangle$ such that (i) A is an arbitrary
non-empty set, and B and D are respectively a ternary and a quaternary
relation among elements of A; (ii) all the axioms of $\mathcal{E}_2$ prove
to hold in $\mathfrak{M}$ if all the variables are assumed to range over
elements of A, and the constants $\beta$ and $\delta$ are understood to
denote the relations B and D, respectively.
4 A brief discussion of the theory $\mathcal{E}_2$ and its metamathematical
properties was given in [7], pp. 43 ff. A detailed development (based upon
the results of [7]) can be found in [4], where, however, the underlying
system of elementary geometry differs from the one discussed in this paper
in its logical structure, primitive symbols, and axioms.
The axiom system for $\mathcal{E}_2$ quoted in the text above is a
simplified version of the system in [7], pp. 55 f. The simplification
consists primarily in the omission of several superfluous axioms. The
proof that those superfluous axioms are actually derivable from the
remaining ones was obtained by Eva Kallin, Scott Taylor, and the author in
connection with a course in the foundations of geometry given by the
author at the University of California, Berkeley, during the academic year
1956-57.
The most familiar examples of models of $\mathcal{E}_2$ (and ones which
can easily be handled by algorithmic methods) are certain Cartesian spaces
over ordered fields. We assume known under what conditions a system
$\mathfrak{F} = \langle F, +, \cdot, \le \rangle$ (where F is a set, + and
$\cdot$ are binary operations under which F is closed, and $\le$ is a
binary relation between elements of F) is referred to as an ordered field
and how the symbols $0$, $x - y$, $x^2$ are defined for ordered fields. An
ordered field $\mathfrak{F}$ will be called Euclidean if every
non-negative element in F is a square; it is called real closed if it is
Euclidean and if every polynomial of an odd degree with coefficients in F
has a zero in F. Consider the set $A_{\mathfrak{F}} = F \times F$ of all
ordered couples $x = \langle x_1, x_2 \rangle$ with $x_1$ and $x_2$ in F.
We define the relations $B_{\mathfrak{F}}$ and $D_{\mathfrak{F}}$ among
such couples by means of the following stipulations:
$B_{\mathfrak{F}}(xyz)$ if and only if $(x_1 - y_1)\cdot(y_2 - z_2) = (x_2 - y_2)\cdot(y_1 - z_1)$,
$0 \le (x_1 - y_1)\cdot(y_1 - z_1)$, and $0 \le (x_2 - y_2)\cdot(y_2 - z_2)$;
$D_{\mathfrak{F}}(xyzu)$ if and only if $(x_1 - y_1)^2 + (x_2 - y_2)^2 = (z_1 - u_1)^2 + (z_2 - u_2)^2$.
The system $\mathfrak{C}_2(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$
is called the (two-dimensional) Cartesian space over $\mathfrak{F}$. If in
particular we take for $\mathfrak{F}$ the ordered field $\mathfrak{R}$ of
real numbers, we obtain the ordinary (two-dimensional) analytic space
$\mathfrak{C}_2(\mathfrak{R})$. 5
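The two stipulations above translate directly into code. The following minimal sketch, not part of the original paper, assumes the ordered field of rational numbers (Python's `Fraction`) in place of a general ordered field; the names `B`, `D`, `collinear`, and `pt` are illustrative. The function `collinear` expresses "z lies on the straight line through x and y" by the three-fold disjunction of betweenness statements used earlier in the text.

```python
from fractions import Fraction

def pt(a, b):
    """Build a point of F x F with exact rational coordinates."""
    return (Fraction(a), Fraction(b))

def B(x, y, z):
    """Betweenness in the Cartesian space: y lies between x and z
    (the case when y coincides with x or z is not excluded)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    return ((x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))

def D(x, y, z, u):
    """Equidistance in the Cartesian space: x is as distant from y as z is from u."""
    (x1, x2), (y1, y2), (z1, z2), (u1, u2) = x, y, z, u
    return (x1 - y1) ** 2 + (x2 - y2) ** 2 == (z1 - u1) ** 2 + (z2 - u2) ** 2

def collinear(x, y, z):
    """The three points lie on one straight line, stated via betweenness only."""
    return B(x, y, z) or B(y, z, x) or B(z, x, y)
```

For example, `B(pt(0, 0), pt(1, 1), pt(2, 2))` holds while `B(pt(0, 0), pt(2, 2), pt(1, 1))` does not, and the triple `pt(0, 0)`, `pt(1, 0)`, `pt(0, 1)` is not collinear, so it witnesses the lower dimension axiom A11 in this structure. Only squared distances are compared, so no square roots are needed and the arithmetic stays exact.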
THEOREM 1 (REPRESENTATION THEOREM). For $\mathfrak{M}$ to be a model of
$\mathcal{E}_2$ it is necessary and sufficient that $\mathfrak{M}$ be
isomorphic with the Cartesian space $\mathfrak{C}_2(\mathfrak{F})$ over
some real closed field $\mathfrak{F}$.
PROOF (in outline). It is well known that all the axioms of
$\mathcal{E}_2$ hold in $\mathfrak{C}_2(\mathfrak{R})$ and that therefore
$\mathfrak{C}_2(\mathfrak{R})$ is a model of $\mathcal{E}_2$. By a
fundamental result in [7], every real closed field $\mathfrak{F}$ is
elementarily equivalent with the field $\mathfrak{R}$, i.e., every
elementary (first-order) sentence which holds in one of these two fields
holds also in the other. Consequently every Cartesian space
$\mathfrak{C}_2(\mathfrak{F})$ over a real closed field $\mathfrak{F}$ is
elementarily equivalent with $\mathfrak{C}_2(\mathfrak{R})$ and hence is a
model of $\mathcal{E}_2$; this clearly applies to all systems
$\mathfrak{M}$ isomorphic with $\mathfrak{C}_2(\mathfrak{F})$ as well.
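The requirement that the field be real closed is not idle. As a hedged illustration, not taken from the paper, the sketch below considers the Cartesian space over the rational numbers, an ordered field that is not Euclidean, and tests whether the point z demanded by the axiom of segment construction (A10) exists with rational coordinates. For $x \neq y$ the condition $\beta(xyz)$ forces $z = y + s(y - x)$ with $s \ge 0$, so z exists exactly when the quotient of the squared lengths of uv and xy is the square of a rational number. The names `is_rational_square` and `segment_construction_possible` are illustrative.

```python
from fractions import Fraction
from math import isqrt

def is_rational_square(q):
    """True if the rational number q is the square of a rational number."""
    q = Fraction(q)
    if q < 0:
        return False
    # A fraction in lowest terms is a square exactly when its numerator
    # and its denominator are both perfect squares.
    n, d = q.numerator, q.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d

def segment_construction_possible(x, y, u, v):
    """Is there a rational point z with beta(xyz) and delta(yzuv)?"""
    if x == y:
        return True  # beta(xxz) holds for every z, and z = y + (v - u) works
    sq_uv = Fraction((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2)
    sq_xy = Fraction((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)
    return is_rational_square(sq_uv / sq_xy)
```

With x = (0, 0), y = (1, 0), u = (0, 0), v = (1, 1) the required point would be $(1 + \sqrt{2}, 0)$, and the function reports that no rational point qualifies; with v = (3, 4) the point (6, 0) does the job. The rational plane is therefore not a model of the axiom system, in agreement with Theorem 1.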
To prove the theorem in the opposite direction, we apply methods and
results of the elementary geometrical theory of proportions, which has
been developed in the literature on several occasions (see, e.g., [1], pp.
51 ff.). Consider a model $\mathfrak{M} = \langle A, B, D \rangle$ of
$\mathcal{E}_2$; let z and u be any two distinct points of A, and F be the
straight line through z and u, i.e., the set of all points x such that
$B(zux)$ or $B(uxz)$ or $B(xzu)$. Applying some familiar geometrical
constructions, we define the operations $+$ and $\cdot$ on, and the
relation $\le$ between, any two points x and y in F. Thus we say that
$x \le y$ if either $x = y$ or else $B(xzu)$ and not $B(yxu)$ or, finally,
$B(zxy)$ and not $B(xzu)$; $x + y$ is defined as the unique point v in F
such that $D(zxyv)$ and either $z \le x$ and $y \le v$ or else $x \le z$
and $v \le y$. The definition of $x \cdot y$ is more involved; it refers to
some points outside of F and is essentially based upon the properties of
parallel lines. Using exclusively axioms A1-A12 we show that
$\mathfrak{F} = \langle F, +, \cdot, \le \rangle$ is an ordered field; with
the help of A13 we arrive at the conclusion that $\mathfrak{F}$ is actually
a real closed field. By considering a straight line G perpendicular to F
at the point z, we introduce a rectangular coordinate system in
$\mathfrak{M}$ and we establish a one-to-one correspondence between points
$x, y, \ldots$ in A and ordered couples of their coordinates
$x = \langle x_1, x_2 \rangle$, $y = \langle y_1, y_2 \rangle$, $\ldots$ in
$F \times F$. With the help of the Pythagorean theorem (which proves to be
valid in $\mathcal{E}_2$) we show that the formula
$D(xyst)$
holds for any given points $x, y, \ldots$ in A if and only if the formula
$D_{\mathfrak{F}}(xyst)$
holds for the correlated couples of coordinates
$x = \langle x_1, x_2 \rangle$, $y = \langle y_1, y_2 \rangle$, $\ldots$ in
$F \times F$, i.e., if
$(x_1 - y_1)^2 + (x_2 - y_2)^2 = (s_1 - t_1)^2 + (s_2 - t_2)^2$;
an analogous conclusion is obtained for $B(xys)$. Consequently, the
systems $\mathfrak{M}$ and $\mathfrak{C}_2(\mathfrak{F})$ are isomorphic,
which completes the proof.
5 All the results in this paper extend (with obvious changes) to the
n-dimensional case for any positive integer n. To obtain an axiom system
for $\mathcal{E}_n$ we have to modify the two dimension axioms, A11 and
A12, leaving the remaining axioms unchanged; by a result in [5], A11 and
A12 can be replaced by any sentence formulated in the symbolism of
$\mathcal{E}_n$ which holds in the ordinary n-dimensional analytic space
but not in any m-dimensional analytic space for $m \neq n$. In
constructing algebraic models for one-dimensional geometries we use
ordered abelian groups instead of ordered fields.
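The geometric definitions of the order and of addition on the line F used in the proof can be checked against ordinary arithmetic. The sketch below is an illustration, not part of the paper; it assumes the integer points of the first coordinate axis, with z = (0, 0) and u = (1, 0), and takes B and D to be the Cartesian relations. The names `leq`, `add`, `Z`, `U`, and `LINE` are illustrative, and the search for the sum is restricted to a finite stretch of the line.

```python
def B(x, y, z):
    """Cartesian betweenness: y lies between x and z (endpoints allowed)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    return ((x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))

def D(x, y, z, u):
    """Cartesian equidistance: x is as distant from y as z is from u."""
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
            == (z[0] - u[0]) ** 2 + (z[1] - u[1]) ** 2)

# Assumed sample data: zero point z, unit point u, and a finite piece of F.
Z, U = (0, 0), (1, 0)
LINE = [(t, 0) for t in range(-6, 7)]

def leq(x, y):
    """x <= y, defined from betweenness exactly as in the proof."""
    return (x == y
            or (B(x, Z, U) and not B(y, x, U))
            or (B(Z, x, y) and not B(x, Z, U)))

def add(x, y):
    """x + y: the unique v on the line with D(zxyv) lying on the proper side of y."""
    found = [v for v in LINE
             if D(Z, x, y, v)
             and ((leq(Z, x) and leq(y, v)) or (leq(x, Z) and leq(v, y)))]
    if len(found) != 1:
        raise ValueError("sum not uniquely determined within LINE")
    return found[0]
```

On this stretch of the line, `leq((s, 0), (t, 0))` agrees with `s <= t` and `add((s, 0), (t, 0))` returns `(s + t, 0)` for all integers s, t between -3 and 3, which is what the isomorphism in the proof leads one to expect.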
We turn to the completeness problem for $\mathcal{E}_2$. A theory is
called complete if every sentence $\sigma$ (formulated in the symbolism of
the theory) holds either in every model of this theory or in no such
model. For theories with standard formalization this definition can be put
in several other equivalent forms; we can say, e.g., that a theory is
complete if, for every sentence $\sigma$, either $\sigma$ or $\neg\sigma$
is valid, or if any two models of the theory are elementarily equivalent.
A theory is called consistent if it has at least one model; here, again,
several equivalent formulations are known. If there is a model
$\mathfrak{M}$ such that a sentence holds in $\mathfrak{M}$ if and only if
it is valid in the given theory, then the theory is clearly both complete
and consistent, and conversely. The solution of the completeness problem
for $\mathcal{E}_2$ is given in the following
THEOREM 2 (COMPLETENESS THEOREM). (i) A sentence formulated in
$\mathcal{E}_2$ is valid if and only if it holds in
$\mathfrak{C}_2(\mathfrak{R})$;
(ii) the theory $\mathcal{E}_2$ is complete (and consistent).
Part (i) of this theorem follows from Theorem 1 and from a fundamental
result in [7] which was applied in the proof of Theorem 1; (ii) is an
immediate consequence of (i).
The next problem which will be discussed here is the decision problem
for $2. It is the problem of the existence of a mechanical method which
enables us in each particular case to decide whether or not a given sen-
tence formulated in <^2 is valid. The solution of this problem is again
positive :
THEOREM 3 (DECISION THEOREM). The theory #2 is decidable.
In fact, $\mathscr{E}_2$ is complete by Theorem 2 and is axiomatizable by its very
description (i.e., it has an axiom system such that we can always decide
whether a given sentence is an axiom). It is known, however, that every
complete and axiomatizable theory with standard formalization is
decidable (cf., e.g., [9], p. 14), and therefore $\mathscr{E}_2$ is decidable. By analyzing
the discussion in [7] we can actually obtain a decision method for $\mathscr{E}_2$.
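
The general argument that a complete and axiomatizable theory is decidable can be pictured as a search: enumerate the valid sentences and stop as soon as either the given sentence or its negation appears. The sketch below illustrates only this enumeration idea on a hypothetical toy theory; the names `negate`, `toy_theorems`, and `decide` are assumptions made for the illustration, and the code is in no way the decision method of [7].

```python
from itertools import count


def negate(sentence):
    """Return the negation of a sentence written with a leading '~'."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def toy_theorems():
    """Hypothetical stand-in for a mechanical enumeration of valid sentences.

    The toy theory is complete for sentences of the form 'even(n)': for each
    natural number n exactly one of 'even(n)' and '~even(n)' is enumerated.
    """
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"~even({n})"


def decide(sentence, theorems):
    """Search the enumeration until the sentence or its negation shows up.

    Termination relies on completeness: one of the two must eventually appear.
    """
    for theorem in theorems():
        if theorem == sentence:
            return True
        if theorem == negate(sentence):
            return False
```

Such a search is guaranteed to stop only because the theory is complete; it says nothing about efficiency, which is why an explicit method such as the one in [7] is of independent interest.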
The last metamathematical problem to be discussed for $\mathscr{E}_2$ is the
problem of finite axiomatizability. From the description of $\mathscr{E}_2$ we see that
this theory has an axiom system consisting of finitely many individual
axioms and of an infinite collection of axioms falling under a single axiom
schema. This axiom schema (which is the symbolic expression occurring
in A13) can be slightly modified so as to form a single sentence in the
system of predicate calculus with free variable first-order predicates, and
all the particular axioms of the infinite collection can be obtained from
this sentence by substitution. We briefly describe the whole situation by
saying that the theory $\mathscr{E}_2$ is "almost finitely axiomatizable", and we now
ask the question whether $\mathscr{E}_2$ is finitely axiomatizable in the strict sense,
i.e., whether the original axiom system can be replaced by an equivalent
finite system of sentences formulated in $\mathscr{E}_2$. The answer is negative:
THEOREM 4 (NON-FINITIZABILITY THEOREM). The theory $\mathscr{E}_2$ is not
finitely axiomatizable.
PROOF (in outline). From the proof of Theorem 1 it is seen that the
infinite collection of axioms A13 can be equivalently replaced by an infinite
sequence of sentences $S_0, \ldots, S_n, \ldots$; $S_0$ states that the ordered field $\mathfrak{F}$
constructed in the proof of Theorem 1 is Euclidean, and $S_n$ for $n > 0$
expresses the fact that in this field every polynomial of degree $2n + 1$
has a zero. For every prime number $p$ we can easily construct an ordered
field $\mathfrak{F}_p$ in which every polynomial of an odd degree $2n + 1 < p$ has a
zero while some polynomial of degree $p$ has no zero; consequently, if
$2m + 1 = p$ is a prime, then all the axioms A1-A12 and $S_n$ with $n < m$
hold in $\mathfrak{C}_2(\mathfrak{F}_p)$ while $S_m$ does not hold. This implies immediately that the
infinite axiom system A1, ..., A12, $S_0, \ldots, S_n, \ldots$ has no finite
subsystem from which all the axioms of the system follow. Hence by a simple
argument we conclude that, more generally, there is no finite axiom
system which is equivalent with the original axiom system for $\mathscr{E}_2$.
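
For orientation, the field-theoretic content of the sentences $S_0, S_1, \ldots$ can be written out roughly as follows. This is only a sketch in the language of ordered fields, not in the symbolism of the geometrical theory itself (the sentences $S_n$ are the geometrical counterparts of these conditions); the variable names, and the reading of "Euclidean" as "every non-negative element is a square", are assumptions made for the illustration.

```latex
% Field-side content of S_0: the ordered field is Euclidean
\bigwedge x \, \bigvee y \, \bigl( 0 \le x \rightarrow x = y \cdot y \bigr)

% Field-side content of S_n (n > 0): every polynomial of degree 2n+1 has a zero
\bigwedge a_0 \ldots a_{2n+1} \, \bigvee x \,
  \bigl( a_{2n+1} \ne 0 \rightarrow
         a_0 + a_1 x + \cdots + a_{2n+1} x^{2n+1} = 0 \bigr)
```

Read in this way, the case $p = 3$ (so $m = 1$) of the construction in the proof is a field in which every non-negative element is a square while some cubic polynomial has no zero.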
From the proof just outlined we see that $\mathscr{E}_2$ can be based upon an
axiom system A1, ..., A12, $S_0, \ldots, S_n, \ldots$ in which (as opposed to the
original axiom system) each axiom can be put in the form of either a
universal sentence or an existential sentence or a universal-existential
sentence; i.e., each axiom is either of the form
$$\bigwedge xy \ldots (\varphi)$$
or else of the form
$$\bigvee uv \ldots (\varphi)$$
or, finally, of the form
$$\bigwedge xy \ldots \bigvee uv \ldots (\varphi)$$
where $\varphi$ is a formula without quantifiers. A rather obvious consequence
of this structural property of the axioms is the fact that the union of a
chain (or of a directed family) of models of $\mathscr{E}_2$ is again a model of $\mathscr{E}_2$. This
consequence can also be derived directly from the proof of Theorem 1.
The conception of elementary geometry with which we have been
concerned so far is certainly not the only feasible one. In what follows we
shall discuss briefly two other possible interpretations of the term
"elementary geometry"; they will be embodied in two different formalized
theories, $\mathscr{E}_2'$ and $\mathscr{E}_2''$.
The theory $\mathscr{E}_2'$ is obtained by supplementing the logical base of $\mathscr{E}_2$
with a small fragment of set theory. Specifically, we include in the
symbolism of $\mathscr{E}_2'$ new variables $X, Y, \ldots$ assumed to range over arbitrary
finite sets of points (or, what in this case amounts essentially to the same,
over arbitrary finite sequences of points); we also include a new logical
constant, the membership symbol $\in$, to denote the membership relation
between points and finite point sets. As axioms for $\mathscr{E}_2'$ we again choose
A1-A13; it should be noticed, however, that the collection of axioms A13
is now more comprehensive than in the case of $\mathscr{E}_2$ since $\varphi$ and $\psi$ stand for
arbitrary formulas constructed in the symbolism of $\mathscr{E}_2'$. In consequence
the theory $\mathscr{E}_2'$ considerably exceeds $\mathscr{E}_2$ in means of expression and power.
In $\mathscr{E}_2'$ we can formulate and study various notions which are traditionally
discussed in textbooks of elementary geometry but which cannot be
expressed in $\mathscr{E}_2$, e.g., the notions of a polygon with arbitrarily many
vertices, and of the circumference and the area of a circle.
As regards metamathematical problems which have been discussed
and solved for $\mathscr{E}_2$ in Theorems 1-4, three of them (the problems of
representation, completeness, and finite axiomatizability) are still open
when referred to $\mathscr{E}_2'$. In particular, we do not know any simple
characterization of all models of $\mathscr{E}_2'$, nor do we know whether any two such
models are equivalent with respect to all sentences formulated in $\mathscr{E}_2'$.
(When speaking of models of $\mathscr{E}_2'$ we mean exclusively the so-called
standard models; i.e., when deciding whether a sentence $\sigma$ formulated in
$\mathscr{E}_2'$ holds in a given model, we assume that the variables $x, y, \ldots$
occurring in $\sigma$ range over all elements of a set, the variables $X, Y, \ldots$ range
over all finite subsets of this set, and $\in$ is always understood to denote the
membership relation.) The Archimedean postulate can be formulated and
proves to be valid in $\mathscr{E}_2'$. Hence, by Theorem 1, every model of $\mathscr{E}_2'$ is
isomorphic with a Cartesian space $\mathfrak{C}_2(\mathfrak{F})$ over some Archimedean real
closed field $\mathfrak{F}$. There are, however, Archimedean real closed fields $\mathfrak{F}$ such
that $\mathfrak{C}_2(\mathfrak{F})$ is not a model of $\mathscr{E}_2'$; e.g., the field of real algebraic numbers is
of this kind. A consequence of the Archimedean postulate is that every
model of $\mathscr{E}_2'$ has at most the power of the continuum (while, if only by
virtue of Theorem 1, $\mathscr{E}_2$ has models with arbitrary infinite powers). In
fact, $\mathscr{E}_2'$ has models which have exactly the power of the continuum, e.g.,
$\mathfrak{C}_2(\mathfrak{R})$, but it can also be shown to have denumerable models. Thus,
although the theory $\mathscr{E}_2'$ may prove to be complete, it certainly has
non-isomorphic models and therefore is not categorical.⁶
⁶ These last remarks result from a general metamathematical theorem (an
extension of the Skolem-Löwenheim theorem) which applies to all theories with the
same logical structure as $\mathscr{E}_2'$, i.e., to all theories obtained from theories with
standard formalization by including new variables ranging over arbitrary finite
sets and a new logical constant, the membership symbol $\in$, and possibly by
extending original axiom systems. By this general theorem, if $\mathscr{T}$ is a theory of the
class just described with at most $\beta$ different symbols, and if a mathematical
system $\mathfrak{M}$ is a standard model of $\mathscr{T}$ with an infinite power $\alpha$, then $\mathfrak{M}$ has
subsystems with any infinite power $\gamma$, $\beta \le \gamma \le \alpha$, which are also standard
models of $\mathscr{T}$. The proof of this theorem (recently found by the author) has not yet
been published; it differs but slightly from the proof of the analogous theorem for
the theories with standard formalization outlined in [10], pp. 92 f. In opposition to
theories with standard formalization, some of the theories $\mathscr{T}$ discussed in this
footnote have models with an infinite power $\alpha$ and with any smaller, but with no
larger, infinite power; an example is provided by the theory $\mathscr{E}_2'$ for which $\alpha$ is
the power of the continuum. In particular, some of the theories $\mathscr{T}$ have
exclusively denumerable models and in fact are categorical; this applies, e.g., to
the theory obtained from Peano's arithmetic in exactly the same way in which $\mathscr{E}_2'$
has been obtained from $\mathscr{E}_2$. There are also theories $\mathscr{T}$ which have models with
arbitrary infinite powers; such is, e.g., the theory $\mathscr{E}_2'''$ mentioned at the end of
this paper.

Only the decision problem for $\mathscr{E}_2'$ has found so far a definite solution:
THEOREM 5. The theory $\mathscr{E}_2'$ is undecidable, and so are all its consistent
extensions.
This follows from the fact that Peano's arithmetic is (relatively)
interpretable in $\mathscr{E}_2'$; cf. [9], pp. 31 ff.
To obtain the theory $\mathscr{E}_2''$ we leave the symbolism of $\mathscr{E}_2$ unchanged but
we weaken the axiom system of $\mathscr{E}_2$. In fact, we replace the infinite
collection of elementary continuity axioms, A13, by a single sentence,
A13', which is a consequence of one of these axioms. The sentence
expresses the fact that a segment which joins two points, one inside and one
outside a given circle, always intersects the circle; symbolically:
A13'. $\bigwedge xyzx'z'u \, \bigvee y' \, [\delta(uxux') \wedge \delta(uzuz') \wedge \beta(uxz) \wedge \beta(xyz) \rightarrow \delta(uyuy') \wedge \beta(x'y'z')]$
As a consequence of the weakening of the axiom system, various
sentences which are formulated and valid in $\mathscr{E}_2$ are no longer valid in $\mathscr{E}_2''$.
This applies in particular to existential theorems which cannot be
established by means of so-called elementary geometrical constructions
(using exclusively ruler and compass), e.g., to the theorem on the
trisection of an arbitrary angle.
With regard to metamathematical problems discussed in this paper the
situation in the case of $\mathscr{E}_2''$ is just opposite to that encountered in the
case of $\mathscr{E}_2'$. The three problems which are open for $\mathscr{E}_2'$ admit of simple
solutions when referred to $\mathscr{E}_2''$. In particular, the solution of the
representation problem is given in the following
THEOREM 6. For $\mathfrak{M}$ to be a model of $\mathscr{E}_2''$ it is necessary and sufficient that
$\mathfrak{M}$ be isomorphic with the Cartesian space $\mathfrak{C}_2(\mathfrak{F})$ over some Euclidean field $\mathfrak{F}$.
This theorem is essentially known from the literature. The sufficiency
of the condition can be checked directly; the necessity can be established
with the help of the elementary geometrical theory of proportions (cf. the
proof of Theorem 1).
Using Theorem 6 we easily show that the theory $\mathscr{E}_2''$ is incomplete,
and from the description of $\mathscr{E}_2''$ we see at once that this theory is finitely
axiomatizable.
On the other hand, the decision problem for $\mathscr{E}_2''$ remains open and
presumably is difficult. In the light of the results in [2] it seems likely that
the solution of this problem is negative; the author would risk the (much
stronger) conjecture that no finitely axiomatizable subtheory of $\mathscr{E}_2$ is
decidable. If we agree to refer to an elementary geometrical sentence (i.e.,
a sentence formulated in $\mathscr{E}_2$) as valid if it is valid in $\mathscr{E}_2$, and as elementarily
provable if it is valid in $\mathscr{E}_2''$, then the situation can be described as
follows: we know a general mechanical method for deciding whether a given
elementary geometrical sentence is valid, but we do not, and probably shall
never know, any such method for deciding whether a sentence of this sort is
elementarily provable.
The differences between $\mathscr{E}_2$ and $\mathscr{E}_2''$ vanish when we restrict ourselves
to universal sentences. In fact, we have
THEOREM 7. A universal sentence formulated in $\mathscr{E}_2$ is valid in $\mathscr{E}_2$ if
and only if it is valid in $\mathscr{E}_2''$.
To prove this we recall that every ordered field can be extended to a
real closed field. Hence, by Theorems 1 and 6, every model of $\mathscr{E}_2''$ can be
extended to a model of $\mathscr{E}_2$. Consequently, every universal sentence
which is valid in $\mathscr{E}_2$ is also valid in $\mathscr{E}_2''$; the converse is obvious. (An even
simpler proof of Theorem 7, and in fact a proof independent of Theorem 1,
can be based upon the lemma by which every finite subsystem of an
ordered field can be isomorphically embedded in the ordered field of real
numbers.)
Theorem 7 remains valid if we remove A13' from the axiom system of
$\mathscr{E}_2''$ (and it applies even to some still weaker axiom systems). Thus we see
that every elementary universal sentence which is valid in $\mathscr{E}_2$ can be
proved without any help of the continuity axioms. The result extends to
all the sentences which may not be universal when formulated in $\mathscr{E}_2$ but
which, roughly speaking, become universal when expressed in the
notation of Cartesian spaces $\mathfrak{C}_2(\mathfrak{F})$.
As an immediate consequence of Theorems 3 and 7 we obtain:
THEOREM 8. The theory $\mathscr{E}_2''$ is decidable with respect to the set of its
universal sentences.
This means that there is a mechanical method for deciding in each
particular case whether or not a given universal sentence formulated in
the theory $\mathscr{E}_2''$ holds in every model of this theory.
We could discuss some further theories related to $\mathscr{E}_2$, $\mathscr{E}_2'$, and $\mathscr{E}_2''$;
e.g., the theory $\mathscr{E}_2'''$ which has the same symbolism as $\mathscr{E}_2'$ and the same
axiom system as $\mathscr{E}_2''$. The problem of deciding which of the various
formal conceptions of elementary geometry is closer to the historical
tradition and the colloquial usage of this notion seems to be rather
hopeless and deprived of broader interest. The author feels that, among
these various conceptions, the one embodied in $\mathscr{E}_2$ distinguishes itself by
the simplicity and clarity of underlying intuitions and by the harmony
and power of its metamathematical implications.
Bibliography
[1] HILBERT, D., Grundlagen der Geometrie. Eighth edition, with revisions and
supplements by P. BERNAYS, Stuttgart 1956, III + 251 pp.
[2] ROBINSON, J., Definability and decision problems in arithmetic. Journal of
Symbolic Logic, vol. 14 (1949), pp. 98-114.
[3] ROBINSON, R. M., Binary relations as primitive notions in elementary geometry.
This volume, pp. 68-85.
[4] SCHWABHÄUSER, W., Über die Vollständigkeit der elementaren euklidischen
Geometrie. Zeitschrift für mathematische Logik und Grundlagen der
Mathematik, vol. 2 (1956), pp. 137-165.
[5] SCOTT, D., Dimension in elementary Euclidean geometry. This volume, pp.
53-67.
[6] SZMIELEW, W., Some metamathematical problems concerning elementary
hyperbolic geometry. This volume, pp. 30-52.
[7] TARSKI, A., A decision method for elementary algebra and geometry. Second
edition, Berkeley and Los Angeles 1951, VI + 63 pp.
[8] TARSKI, A., Contributions to the theory of models. Indagationes Mathematicae,
vol. 16 (1954), pp. 572-588, and vol. 17 (1955), pp. 56-64.
[9] TARSKI, A., MOSTOWSKI, A., and ROBINSON, R. M., Undecidable theories.
Amsterdam 1953, XI + 98 pp.
[10] TARSKI, A., and VAUGHT, R. L., Arithmetical extensions of relational systems.
Compositio Mathematica, vol. 13 (1957), pp. 81-102.
[11] VAN DER WAERDEN, B. L., Modern Algebra. Revised English edition, New
York, 1953, vol. 1, XII + 264 pp.
Symposium on the Axiomatic Method
SOME METAMATHEMATICAL PROBLEMS CONCERNING
ELEMENTARY HYPERBOLIC GEOMETRY
WANDA SZMIELEW
University of Warsaw, Warsaw, Poland, and Institute for Basic Research in Science,
University of California, Berkeley, California, U.S.A.
Introduction. In this paper we shall be concerned with a formalized
system $\mathscr{H}_n$ of elementary n-dimensional hyperbolic (Bolyai-Lobachevskian)
geometry. Throughout the paper we shall use the notation introduced by
Tarski in [5]. In particular the system $\mathscr{H}_n$ has the same logical structure
and the same symbolism as Tarski's system $\mathscr{E}_2$ of elementary Euclidean
geometry. In case $n = 2$ it differs from $\mathscr{E}_2$ only in that Euclid's axiom,
A8, has been replaced by its negation; for $n > 2$ the dimension axioms,
A11 and A12, should, in addition, be appropriately modified. The aim of
this paper is to extend to the system $\mathscr{H}_n$ the fundamental
metamathematical results stated in [5] for the system $\mathscr{E}_2$.¹
The paper is divided into three sections. In Section 1 we shall indicate
how the solutions of the metamathematical problems in which we are
interested can be obtained by means of a familiar algorithm, the
end-calculus of Hilbert (cf. [1], pp. 159 ff.). In Section 2 we shall construct
a new geometrical algorithm, the hyperbolic calculus of segments, which
will prove to provide a convenient apparatus for a new solution of the
same problems. The results established in Sections 1 and 2 have
interesting implications for some related geometrical systems, in fact, for
elementary absolute geometry (i.e., the common part of elementary
Euclidean and hyperbolic geometries) and for non-elementary hyperbolic
geometry. These implications will be discussed in Section 3.²
¹ The results of this paper were obtained while the author was working in the
University of California, Berkeley on a research project in the foundations of
mathematics sponsored by the U.S. National Science Foundation.
² All the observations which will be given in Section 1 and those concerning
absolute geometry in Section 3 have been made jointly by Tarski and the author.

1. Hilbert-Szász Spaces. In [1] Hilbert gives an outline of his
end-calculus, defines in its terms the coordinates of straight lines and points
and establishes an analytic condition for a point to lie on a straight line.
The whole discussion is done in a system of hyperbolic geometry included
in $\mathscr{H}_2$. In [3] Szász somewhat modifies Hilbert's construction and moreover
establishes an analytic formula for the distance between two points.
The latter formula is essential for our purposes and therefore we shall
refer in what follows to [3] and not to [1].
The discussion in [3] leads to an important class of models of $\mathscr{H}_2$ which
can be obtained by means of the following algebraic construction:
Consider an arbitrary ordered field $\mathfrak{F} = \langle F, +, \cdot, \le \rangle$. For any ordered
triples $x = \langle x_1, x_2, x_3 \rangle$ and $y = \langle y_1, y_2, y_3 \rangle$ in $F \times F \times F$ let
$$\Phi(x, y) = x_1 \cdot y_1 + x_2 \cdot y_2 - x_3 \cdot y_3.$$
By $A_{\mathfrak{F}}$ we denote the subset of $F \times F \times F$ consisting of all those triples
$x$ for which
$$\Phi(x, x) = -1 \text{ and } x_3 > 0.$$
By $B_{\mathfrak{F}}$ (the betweenness relation) we denote the ternary relation which
holds among the triples $x, y, z$ in $A_{\mathfrak{F}}$ if and only if
$$\Phi(u, u) > 0,\ \Phi(u, x) = 0,\ \Phi(u, y) = 0,\ \Phi(u, z) = 0 \text{ for some } u \in F \times F \times F$$
and moreover
$$\Phi(x, y) \ge \Phi(x, z) \text{ and } \Phi(y, z) \ge \Phi(x, z).$$
Finally, by $D_{\mathfrak{F}}$ (the equidistance relation) we denote the quaternary
relation which holds among the triples $x, y, z, u$ in $A_{\mathfrak{F}}$ if and only if
$$\Phi(x, y) = \Phi(z, u).$$
The system $\langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ thus obtained will be denoted by $\mathfrak{H}_2(\mathfrak{F})$ and
will be referred to as the two-dimensional Hilbert-Szász space over the
field $\mathfrak{F}$.
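
The definitions of the form and of the three components of the space can be tried out on concrete triples. The sketch below does this over the rational numbers, using exact `Fraction` arithmetic. It is meant only to illustrate the definitions: the rationals are an ordered field but not a real closed one, so the resulting system is not claimed to be a model of the geometry. The function names and the way a witness `u` is searched for (via a cross product adjusted to the form) are assumptions of this illustration, not part of the text.

```python
from fractions import Fraction as Q


def phi(x, y):
    """The form Phi(x, y) = x1*y1 + x2*y2 - x3*y3."""
    return x[0] * y[0] + x[1] * y[1] - x[2] * y[2]


def is_point(x):
    """Membership in A: Phi(x, x) = -1 and x3 > 0."""
    return phi(x, x) == -1 and x[2] > 0


def phi_normal(x, y):
    """A triple u with Phi(u, x) = Phi(u, y) = 0 (possibly the zero triple)."""
    c = (x[1] * y[2] - x[2] * y[1],
         x[2] * y[0] - x[0] * y[2],
         x[0] * y[1] - x[1] * y[0])
    return (c[0], c[1], -c[2])


def between(x, y, z):
    """The relation B for points x, y, z of A (y between x and z)."""
    if not (phi(x, y) >= phi(x, z) and phi(y, z) >= phi(x, z)):
        return False
    candidates = [phi_normal(a, b) for a, b in ((x, z), (x, y), (y, z))]
    nonzero = [u for u in candidates if any(u)]
    if not nonzero:
        return True  # x = y = z; a suitable witness u always exists
    u = nonzero[0]   # the witness is unique up to a scalar factor
    return phi(u, u) > 0 and all(phi(u, w) == 0 for w in (x, y, z))


def equidistant(x, y, z, u):
    """The relation D: Phi(x, y) = Phi(z, u)."""
    return phi(x, y) == phi(z, u)


# Four sample points of A with rational coordinates
x = (Q(-3, 4), Q(0), Q(5, 4))
y = (Q(0), Q(0), Q(1))
z = (Q(3, 4), Q(0), Q(5, 4))
w = (Q(0), Q(3, 4), Q(5, 4))
```

With these sample points, `y` lies between `x` and `z`, the segments `xy` and `yz` are equidistant, and `w` is not collinear with `x` and `z`.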
As a direct consequence of Szász' discussion the following result is
obtained: Every model of $\mathscr{H}_2$ is isomorphic with the space $\mathfrak{H}_2(\mathfrak{F})$ over
some Euclidean field $\mathfrak{F}$. By supplementing the argument of Szász we
easily show, by means of the elementary continuity axioms, A13 (see
[5], p. 20), that the field $\mathfrak{F}$ is real closed. Since $\mathscr{H}_2$ has a model, then for
some real closed field $\mathfrak{F}$ the space $\mathfrak{H}_2(\mathfrak{F})$ is a model of $\mathscr{H}_2$. And since, by a
fundamental result of Tarski in [4], any two real closed fields $\mathfrak{F}'$ and $\mathfrak{F}''$
are elementarily equivalent, so are also the spaces $\mathfrak{H}_2(\mathfrak{F}')$ and $\mathfrak{H}_2(\mathfrak{F}'')$, and
consequently each of the spaces $\mathfrak{H}_2(\mathfrak{F})$ is a model of $\mathscr{H}_2$. This clearly
applies to all the systems isomorphic with $\mathfrak{H}_2(\mathfrak{F})$ as well. Thus we have
arrived at the following
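THEOREM 1.1 (REPRESENTATION THEOREM). A system $\mathfrak{M} = \langle A, B, D \rangle$
is a model of $\mathscr{H}_2$ if and only if it is isomorphic with the Hilbert-Szász space
$\mathfrak{H}_2(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over some real closed field $\mathfrak{F}$.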
Theorem 1.1 implies as a corollary
THEOREM 1.2. The theory $\mathscr{H}_2$ is complete and decidable but not finitely
axiomatizable.
The proof of Theorem 1.2 is quite analogous to the proof of the
corresponding results (Theorems 2, 3, 4) for the system $\mathscr{E}_2$ in [5].
In this way we have established the fundamental metamathematical
properties of two-dimensional elementary hyperbolic geometry. The
extension of these results to n-dimensional geometries does not seem to
present any essential difficulty.
2. Klein Spaces and Hyperbolic Calculus of Segments. In this section we
wish to establish fundamental metamathematical properties of $\mathscr{H}_n$ by
using in the representation theorem, instead of the Hilbert-Szász models,
the much more familiar and intuitively simpler Klein models. We could
try to derive the new representation theorem from the old one by showing
in a purely algebraic way that every Klein model is isomorphic with some
Hilbert-Szász model, and conversely. We prefer, however, to obtain this
result by means of a direct procedure, and to this end we construct a
special geometrical algorithm, which will be called the hyperbolic calculus
of free segments. This algorithm seems to present some geometrical
interest independent of any metamathematical applications and to be
conceptually simpler than the end-calculus of Hilbert.
Consider a model $\mathfrak{M} = \langle A, B, D \rangle$ of $\mathscr{H}_n$ ($n \ge 2$) formed by an arbitrary
set $A$, a ternary relation $B$ (the betweenness relation), and a quaternary
relation $D$ (the equidistance relation) among elements (points) of $A$. By a
segment we understand any non-ordered couple $pq$ of two distinct points
$p, q$ in $A$. Two segments $pq$ and $rs$ are congruent (in symbols, $pq \cong rs$) if
and only if $D(pqrs)$. The set of all segments congruent to a given segment
$pq$ is called the free segment determined by $pq$ and is denoted by $[pq]$. Free
segments will be represented by variables $X, Y, Z, \ldots$ and the set of all
free segments will be denoted by $S$. We wish to define a binary relation
$\le$ between elements of $S$ and two binary operations $+$ and $\cdot$ on elements
of $S$ in such a way that the rectangular coordinates introduced on the
base of the resulting calculus function as the Beltrami coordinates, and,
in fact, lead to the Klein model.
To obtain appropriate definitions let us assume for a while that $\mathfrak{M}$ is a
model, not only of $\mathscr{H}_2$, but of full two-dimensional hyperbolic geometry
with the non-elementary axiom of continuity (e.g., the ordinary Klein
model). As is well known, in such a model $\mathfrak{M}$ we can correlate with every
angle $PQ$ a real number $\mu(PQ)$, $0 < \mu(PQ) < \pi$, called the measure of $PQ$.
The angle $PQ$ is understood here as the non-ordered pair of half-lines $P$
and $Q$ which are supposed to be non-collinear and to have a common
origin. Hence we can define in $\mathfrak{M}$ the Lobachevskian function $\Pi$, which
assigns a real number $\Pi(X)$, $0 < \Pi(X) < \pi/2$, to every free segment $X$.
In fact, given an oriented straight line $L$ (Figure 1), a point $p$ not on $L$,
the perpendicular projection $q$ of $p$ upon $L$, and the half-line $P$ with
origin $p$ and parallel to $L$, if $Q$ is the half-line $pq$ and $X = [pq]$, then
$\Pi(X) = \mu(PQ)$. The Beltrami coordinates of an arbitrary point $p$ of the
model $\mathfrak{M}$, if different from 0, are numbers of the form $\pm\cos \Pi(X)$,
$\pm\cos \Pi(Y)$, where $X$ and $Y$ are two free segments correlated with
point $p$. Using this fact we define the relation $\le$ and the operations $+$
and $\cdot$ for elements of $S$ by the following conditions:
Fig. 1
(I) $X \le Y$ if and only if $\cos \Pi(X) \le \cos \Pi(Y)$,
(II) $X + Y = Z$ if and only if $\cos \Pi(X) + \cos \Pi(Y) = \cos \Pi(Z)$,
(III) $X \cdot Y = Z$ if and only if $\cos \Pi(X) \cdot \cos \Pi(Y) = \cos \Pi(Z)$.
We shall show that these definitions can be replaced by equivalent
ones formulated entirely in terms of the relations $B$ and $D$. This will
make it possible to extend the definitions to an arbitrary model $\mathfrak{M}$.
Relation $\le$. In view of (I) (since both functions, $\cos$ in the interval
$(0, \pi/2)$ and $\Pi$, are decreasing) $\le$ is the ordinary less than or equal to
relation; speaking precisely
(I') $X \le Y$ if and only if $B(pqr)$, $[pq] = X$, and $[pr] = Y$, for some
$p, q, r \in A$.
As usual the symbol $\ge$ will denote the relation converse to $\le$.
In defining the operations + and •, and in deducing their fundamental
properties we shall use the notions of a proper or improper right triangle
and of a proper or improper right quadrangle (i.e., a quadrangle with
three right angles). For our purposes it is convenient to introduce these
notions in the following way:
Given three non-collinear points $p, q, r$, we say that the ordered triple
$pqr$ is a (proper) right triangle if and only if $\angle pqr$ is a right angle.
Fig. 2
Given two distinct points $p$ and $q$, a half-line $P$ with origin $p$, and a
half-line $Q$ with origin $q$ (Figure 2), we say that the ordered quadruple
$PpqQ$ is an improper right triangle if and only if the half-lines $qp$ and $Q$
form a right angle and $P \parallel Q$. It is clear that points $p$ and $q$ uniquely
determine half-lines $P$ and $Q$.
Given four points $p, q, r, s$, no three of which are collinear, we say that
the ordered quadruple $pqrs$ is a (proper) right quadrangle if and only if
$\angle spq$, $\angle pqr$, and $\angle qrs$ are three right angles. It is clear that there are
non-collinear points $p, q, r$ such that $\angle pqr$ is a right angle and for which
there is no point $s$ such that $pqrs$ is a right quadrangle.
Given three distinct points $p, q, r$, a half-line $P$ with origin $p$, and a
half-line $R$ with origin $r$ (Figure 3), we say that the ordered quintuple
$PpqrR$ is an improper right quadrangle if and only if half-lines $P$ and $pq$,
$qp$ and $qr$, $rq$ and $R$ form three right angles and $P \parallel R$. It is clear that
points $p$ and $q$ determine uniquely the half-line $P$, the point $r$, and the
half-line $R$.
Fig. 3
Before defining the operations $+$ and $\cdot$ in terms of the relations $B$ and $D$
we first introduce four auxiliary operations on elements of $S$, in fact, two
binary operations, $\oplus$ and $\odot$, and two unary operations, $R$ and $C$.
Operation $\oplus$. Given two free segments $X$ and $Y$, consider the free
segment $Z$ constructed in the following way: For some right quadrangle
$pqrs$, let $X = [pq]$, $Y = [qr]$, and $Z = [qs]$ (Figure 4). Clearly, the
segment $Z$ thus defined does not always exist (since the right quadrangle
$pqrs$ cannot always be constructed). If, however, $Z$ exists, it is uniquely
determined by $X$ and $Y$ (independent of the choice of $pqrs$) and we then
put $X \oplus Y = Z$. To express the fact that $X \oplus Y$ does, or does not,
exist, we shall respectively write $X \oplus Y \in S$, $X \oplus Y \notin S$.
Operation $\odot$.³ Given two free segments $X$ and $Y$, we consider a right
triangle $pqr$ with $X = [pq]$ and $Y = [qr]$ (Figure 5), and we put
$X \odot Y = [pr]$. The operation $\odot$ thus defined is always performable, i.e.,
we have $X \odot Y \in S$ for any $X, Y \in S$.
It is worth while to notice that both the operations $\oplus$ and $\odot$ have
sense in absolute geometry and that they coincide in Euclidean geometry.
Operation $R$. Given a free segment $X$, we consider an isosceles right
triangle $pqr$ with $X = [pr]$ (Figure 6), and we put $RX = [pq] = [qr]$.
Clearly the operation $R$ is always performable. $RX$ can be referred to
as the square root of $X$.
Operation $C$. Given a free segment $X$, we consider an improper right
quadrangle $PpqrR$ with $X = [pq]$ (Figure 7), and we put $CX = [qr]$.
Obviously the operation $C$ is always performable. $CX$ can be referred to
as the complement of $X$.
³ This operation was studied by Hjelmslev in [2].
Fig. 7
Clearly, the four operations just defined can be characterized in terms
of the primitive relations $B$ and $D$.
Using some well known theorems of hyperbolic geometry we can
easily establish in $\mathfrak{M}$ the formulas
$$\cos^2 \Pi(X) + \cos^2 \Pi(Y) = \cos^2 \Pi(X \oplus Y),$$
$$\sin \Pi(X) \cdot \sin \Pi(Y) = \sin \Pi(X \odot Y),$$
By definitions (II) and (III) these formulas imply at once the following
equivalences:
(II') $X + Y = Z$ if and only if $CRCX \oplus CRCY = CRCZ$,
(III') $X \cdot Y = Z$ if and only if $CX \odot CY = CZ$.
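
The equivalences (II') and (III') can be checked numerically in the ordinary real model. The sketch below identifies a free segment with its length and rests on assumptions that are not stated in the text: the classical formula $\Pi(x) = 2 \arctan e^{-x}$ (curvature $-1$), so that $\cos \Pi(x) = \tanh x$ and $\sin \Pi(x) = 1/\cosh x$, together with the relations $\Pi(CX) = \pi/2 - \Pi(X)$ and $\sin^2 \Pi(RX) = \sin \Pi(X)$ for the two unary operations. All function names are chosen for this illustration only.

```python
import math


def lobachevsky(x):
    """Angle of parallelism Pi(x) = 2*atan(exp(-x)) (assumed, curvature -1)."""
    return 2.0 * math.atan(math.exp(-x))


def oplus(x, y):
    """X (+) Y via cos^2 Pi(X) + cos^2 Pi(Y) = cos^2 Pi(X (+) Y), or None."""
    s = math.tanh(x) ** 2 + math.tanh(y) ** 2
    return math.atanh(math.sqrt(s)) if s < 1.0 else None


def odot(x, y):
    """X (.) Y via sin Pi(X) * sin Pi(Y) = sin Pi(X (.) Y)."""
    return math.acosh(math.cosh(x) * math.cosh(y))


def root(x):
    """RX: leg of the isosceles right triangle with hypotenuse X."""
    return math.acosh(math.sqrt(math.cosh(x)))


def complement(x):
    """CX, assuming Pi(CX) = pi/2 - Pi(X), i.e. tanh(CX) = 1/cosh(X)."""
    return math.atanh(1.0 / math.cosh(x))


def crc(x):
    """The composite operation CRC."""
    return complement(root(complement(x)))


def add(x, y):
    """X + Y by definition (II): tanh z = tanh x + tanh y, or None."""
    t = math.tanh(x) + math.tanh(y)
    return math.atanh(t) if t < 1.0 else None


def mul(x, y):
    """X . Y by definition (III): tanh z = tanh x * tanh y."""
    return math.atanh(math.tanh(x) * math.tanh(y))
```

For sample lengths such as 0.3 and 0.4, the value `oplus(crc(x), crc(y))` agrees with `crc(add(x, y))`, and `odot(complement(x), complement(y))` agrees with `complement(mul(x, y))`, up to rounding error.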
We now return to the original model $\mathfrak{M}$ of $\mathscr{H}_n$. In this model we
introduce the auxiliary operations $\oplus$, $\odot$, $R$, $C$ (in the definitions of $\oplus$
and $C$ it should be additionally mentioned that $pqrs$ and $PpqrR$ are
quadrangles on a plane) and assume equivalences (I'), (II'), (III') as
definitions of $\le$, $+$, $\cdot$.
We shall now establish the fundamental properties of the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$. A detailed discussion will be given only for the case
$n \ge 3$, thus using (when needed) three-dimensional constructions. Some
remarks concerning the case $n = 2$ will be given later.
In Lemma 2.1 we state some fundamental properties of the relation $\le$
and the auxiliary operations.
LEMMA 2.1. The system $\langle S, \oplus, \odot, R, C, \le \rangle$ satisfies the following
conditions:
(i) $\langle S, \le \rangle$ is a non-empty simply ordered system;
(ii) if $X, Y, X \oplus Y \in S$, then $X \oplus Y = Y \oplus X$;
(iii) if $X, Y, Z, X \oplus Y, (X \oplus Y) \oplus Z \in S$, then $Y \oplus Z \in S$ and
$(X \oplus Y) \oplus Z = X \oplus (Y \oplus Z)$;
(iv) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X \oplus Y = Z$
for some $Y \in S$;
(v) if $X, Y \in S$, then $X \odot Y \in S$ and $X \odot Y = Y \odot X$;
(vi) if $X, Y, Z \in S$, then $(X \odot Y) \odot Z = X \odot (Y \odot Z)$;
(vii) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X \odot Y = Z$
for some $Y \in S$;
(viii) if $X \in S$, then $RX \in S$ and $RX \odot RX = X$;
(ix) if $X, Y \in S$, then $X \le Y$ if and only if $RX \le RY$;
(x) if $X \in S$, then $CX \in S$ and $CCX = X$;
(xi) if $X, Y \in S$, then $X \le Y$ if and only if $CX \ge CY$;
(xii) if $X \in S$, then $X \oplus CX \notin S$;
(xiii) if $X, Z \in S$ and $Z < CX$, then $X \oplus Z \in S$.
PROOF. All the postulates (i)-(xiii) with the exception of (iii) and (vi)
result immediately from the definitions of the notions involved.
Fig. 8
To derive Postulate (iii) (the associative law for $\oplus$) let $X, Y, Z \in S$ and let
$p, q, r, s, t, u, w$ be seven distinct points (Figure 8) satisfying the following
conditions: ($\alpha$) $[pq] = X$, $[pr] = Y$, $[ps] = Z$, and the three segments
$pq$, $pr$, $ps$ are pairwise perpendicular, ($\beta$) $qprt$ and $tpsu$ are two right
quadrangles, and ($\gamma$) $w$ is the perpendicular projection of the point $u$ upon
the straight line $L$ which passes through the point $r$, is perpendicular to
the straight line $pr$, and lies in the plane $prs$. Then $[pt] = X \oplus Y$ and
$[pu] = (X \oplus Y) \oplus Z$. Furthermore $qpwu$ proves to be a right quadrangle;
using this fact $rpsw$ is shown to be a right quadrangle as well. Hence
$[pw] = Y \oplus Z$ and $[pu] = X \oplus (Y \oplus Z)$. Consequently
$(X \oplus Y) \oplus Z = X \oplus (Y \oplus Z)$, what was to be proved.
To derive Postulate (vi) (the associative law for $\odot$) let $X, Y, Z \in S$ and
let $p, q, r, s$ be four distinct points (Figure 9) satisfying the conditions:
($\delta$) $[pq] = X$, $[qr] = Y$, $[rs] = Z$, and ($\varepsilon$) $\angle pqr$, $\angle prs$, $\angle qrs$ are
three right angles. Then $[pr] = X \odot Y$, $[qs] = Y \odot Z$,
$[ps] = (X \odot Y) \odot Z$, and $\angle pqs$ is a right angle. Hence
$[ps] = X \odot (Y \odot Z)$ and consequently
$(X \odot Y) \odot Z = X \odot (Y \odot Z)$.⁴ The proof of Lemma 2.1 has thus been
completed.
Fig. 9
By the next two Lemmas the discussion of the properties of the
operations $\cdot$ and $+$ reduces to that of the properties of the operations $\odot$
and $\oplus$ respectively.
LEMMA 2.2. The function $C$ maps the system $\langle S, \cdot, \le \rangle$ isomorphically
onto the system $\langle S, \odot, \ge \rangle$.
This lemma follows directly from Lemma 2.1(x)(xi) and the definition
of $\cdot$.
LEMMA 2.3. The function $CRC$ maps the system $\langle S, +, \cdot, \le \rangle$
isomorphically onto the system $\langle S, \oplus, \cdot, \le \rangle$.
⁴ The argument used in the proof of (vi) can be found in [2], p. 5.
PROOF. By Lemmas 2.2, 2.1(viii)(ix) and the definition of $+$, the
function $CRC$ maps the system $\langle S, +, \le \rangle$ isomorphically onto the
system $\langle S, \oplus, \le \rangle$. To complete the proof it is sufficient to show that
(1) $X \cdot Y = Z$ if and only if $CRCX \cdot CRCY = CRCZ$.
From Lemma 2.1(v)-(viii) we easily derive the formula
$$R(X \odot Y) = RX \odot RY,$$
which, together with the definition of $\cdot$ and Lemma 2.1(x), gives us the
required equivalence (1).
The next lemma provides a new geometrical construction by means of
which the operation $\cdot$ can be obtained. This lemma will eventually lead
to the distributive law for $\cdot$ under $+$; it also will be helpful in setting up
the foundations of the theory of proportion (see Lemma 2.11).
LEMMA 2.4. Let $PpqQ$ be an improper right triangle. Furthermore, let
$r \in P$ and let $s$ be the perpendicular projection of $r$ upon the straight line
$pq$. Under these assumptions, if $[pq] = X$ and $[pr] = Y$, then
$[ps] = X \cdot Y$ (Figure 10).
Fig. 10
PROOF. We assume that $[pq] = X$ and $[pr] = Y$. Consider four points
$t, p_1, r_1, s_1$ and three half-lines $T, R_1, S_1$ (Figure 11) which satisfy the
following conditions: ($\alpha$) $p_1 \ne p$, the straight line $pp_1$ is perpendicular
to the plane $pqr$, and $[pp_1] = CY$, and ($\beta$) $Ppp_1r_1R_1$, $QqptT$, and $T \ldots$
are three improper right quadrangles. Then
$$[p_1r_1] = Y,\quad [pt] = CX,\quad [tp_1] = CX \odot CY,\quad [p_1s_1] = X \cdot Y$$
and
$$P \parallel R_1,\quad Q \parallel T,\quad \ldots;$$
since $P \parallel Q$, the latter formulas imply $R_1 \parallel S_1$. Furthermore, it is easy to
check that ($\gamma$) the straight line $pp_1$ is perpendicular to the plane $\ldots$
and hence the angles $spr$ and $s_1p_1r_1$ are congruent, and that ($\delta$) $\angle \ldots$
is a right angle. Thus the triangles $prs$ and $p_1r_1s_1$ are congruent and,
specifically, segments $ps$ and $p_1s_1$ are congruent. In conclusion,
$[ps] = X \cdot Y$.
From Lemma 2.4 we readily derive
LEMMA 2.5. If $X, U \in S$, then $X \cdot U \oplus CX \cdot U = U$.
PROOF. Let $X, U \in S$. We pick an improper right quadrangle $Qqpq_1Q_1$
for which $[pq] = X$ (Figure 12). Then $[pq_1] = CX$. On the half-line $P$
with origin $p$ and parallel to the half-lines $Q$ and $Q_1$ we choose a point $r$ in
such a way that $[pr] = U$. Let $s$ and $s_1$ be perpendicular projections of $r$
upon the straight lines $pq$ and $pq_1$. Then $sps_1r$ is a right quadrangle, and
by Lemma 2.4, we have $[ps] = X \cdot U$ and $[ps_1] = CX \cdot U$. Hence
$X \cdot U \oplus CX \cdot U = U$, which completes the proof.
As an immediate consequence of Lemmas 2.1(i)-(vii), 2.2, 2.3, 2.1(xii)-
(xiii), and 2.5 we obtain the fundamental theorem on the calculus of free
segments.
THEOREM 2.6. For every model $\mathfrak{M}$ of $\mathcal{H}_n$ ($n \ge 3$), the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$ satisfies the following conditions:
(i) $\langle S, \le \rangle$ is a non-empty simply ordered system;
(ii) if $X, Y, X + Y \in S$, then $X + Y = Y + X$;
(iii) if $X, Y, Z, X + Y, (X + Y) + Z \in S$, then $Y + Z \in S$ and
$(X + Y) + Z = X + (Y + Z)$;
(iv) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X + Y = Z$
for some $Y \in S$;
(v) if $X, Y \in S$, then $X \cdot Y \in S$ and $X \cdot Y = Y \cdot X$;
(vi) if $X, Y, Z \in S$, then $(X \cdot Y) \cdot Z = X \cdot (Y \cdot Z)$;
(vii) if $X, Z \in S$, then $Z \le X$ if and only if $Z = X$ or else $X \cdot Y = Z$ for
some $Y \in S$;
(viii) if $X \in S$, then there is a $Y \in S$ such that: ($\alpha$) $X + Y \notin S$, ($\beta$) if
$Z \in S$ and $Z < Y$, then $X + Z \in S$, and ($\gamma$) $X \cdot U + Y \cdot U = U$ for
every $U \in S$.
NOTE 2.7. Theorem 2.6 can be extended to the case n = 2.
In fact, Lemmas 2.1(iii), 2.1(vi), and 2.4 are the only ones in proofs of
which three-dimensional constructions are involved. These constructions
should now be replaced by two-dimensional ones. Unfortunately, a direct
two-dimensional proof of Lemma 2.4 is still lacking.5 We know, however,
an indirect two-dimensional proof of this lemma; it is based upon one of
the fundamental results of Section 1, namely the completeness of $\mathcal{H}_2$
(see Theorem 1.2). On the other hand, we know two direct two-dimensional
arguments which lead from Lemma 2.4 to Lemmas 2.1(iii) and 2.1(vi),
respectively. As opposed to the three-dimensional proofs of Lemmas 2.1(iii)
and 2.1(vi), which have a quite elementary character, these two-dimensional
arguments are rather involved and refer to deep properties of the
plane. Lack of space prevents us from outlining these constructions.
As a consequence of Theorem 2.6, Note 2.7 and the elementary continuity
axioms, we obtain by purely algebraic argument
THEOREM 2.8. For every model $\mathfrak{M}$ of $\mathcal{H}_n$ ($n \ge 2$), the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$ can be imbedded in a real closed field $\mathfrak{F} = \langle F, +, \cdot, \le \rangle$
in such a way that S consists of all those elements $X \in F$ for which $0 < X < 1$
(where 0 is the zero element and 1 is the unit element of the field $\mathfrak{F}$). In fact, $\mathfrak{F}$
is up to isomorphism uniquely determined by $\mathfrak{S}$.
The proof of this theorem is easy, though lengthy and laborious.
Postulate (viii) (of Theorem 2.6) plays an essential role in showing that S
is the set of all elements of F between 0 and 1. While the last part of (viii)
is a particular case of the distributive law, (viii) also plays an essential
role in the derivation of this law in its general form.
5 See Footnote 6 on page 51.
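The content of postulate (viii) can be illustrated by a small computation. The following Python sketch is not part of the paper. It assumes that S is modelled by the rational numbers strictly between 0 and 1 (the rationals are not a real closed field, but they suffice for the arithmetic identities checked here), and it assumes that the element Y required by (viii) is the complement 1 - X. The names `in_S`, `witness` and `check_viii` are hypothetical.

```python
from fractions import Fraction


def in_S(x):
    """Membership in S, modelled here as the open interval (0, 1)."""
    return 0 < x < 1


def witness(x):
    """Assumed witness Y for postulate (viii): the complement 1 - X."""
    return 1 - x


def check_viii(x, samples):
    """Check parts (alpha), (beta), (gamma) of (viii) for X against sample elements."""
    y = witness(x)
    # (alpha): Y is in S, but the sum X + Y falls outside S.
    alpha = in_S(y) and not in_S(x + y)
    # (beta): every Z in S below Y still gives a sum X + Z inside S.
    beta = all(in_S(x + z) for z in samples if in_S(z) and z < y)
    # (gamma): the special case of the distributive law, X*U + Y*U = U.
    gamma = all(x * u + y * u == u for u in samples if in_S(u))
    return alpha and beta and gamma


samples = [Fraction(k, 12) for k in range(1, 12)]
results = {x: check_viii(x, samples) for x in samples}
```

Part (gamma) is where the distributive law enters: with Y = 1 - X it reads (X + Y)·U = U, although the sum X + Y itself is not an element of S.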
From now on we assume that the field $\mathfrak{F}$ involved in Theorem 2.8 has
been fixed and we apply to it the familiar field-theoretical notation. In
particular, the operations + and • are now understood to be performable
on arbitrary elements of the field and not only on free segments.
Theorem 2.8 essentially completes our outline of the calculus of free
segments. We shall need however a few further lemmas of a related
character before we turn, in Theorems 2.15 and 2.16, to the metamathematical
discussion of systems $\mathcal{H}_n$.
LEMMA 2.9. If $X, Y, Z \in S$, then
(i) $X \odot Y = Z$ if and only if $CX \cdot CY = CZ$;
(ii) $X \oplus Y = Z$ if and only if $X^2 + Y^2 = Z^2$;
(iii) $CX = \sqrt{1 - X^2}$.
PROOF. The equivalence (i) is an immediate consequence of the
definition of $\cdot$ and Lemma 2.1(x).
By (i) and Lemma 2.1(viii)-(x) we get
(2) $CRCX \cdot CRCX = X$.
From the definition of $+$ and formulas (1) (on page 41) and (2) we
easily derive the equivalence (ii).
The formula (iii) follows at once from (ii) and Lemma 2.5.
LEMMA 2.10. Let pqrs be a right quadrangle and let X = [pq], Y = [qr],
Z = [rs]. Then we have $X = CY \cdot Z$.
PROOF. Let U = [qs] (Figure 13). Then, in agreement with the definitions
of $\oplus$ and $\odot$, and by Lemma 2.9, we have
$U^2 = X^2 + Y^2$ and $1 - U^2 = (1 - Y^2) \cdot (1 - Z^2)$.
Comparing these two formulas and applying Lemma 2.9(iii) we obtain
the conclusion.
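The comparison of the two formulas in this proof is a short algebraic elimination, which can be checked mechanically. The sketch below is an illustration only and not the author's argument: it reads the two formulas as U² = X² + Y² and 1 - U² = (1 - Y²)(1 - Z²), eliminates U², and confirms with exact rational arithmetic that X² = (1 - Y²)·Z², which is the square of CY·Z by Lemma 2.9(iii). The function name `lemma_2_10_holds` is hypothetical.

```python
from fractions import Fraction


def lemma_2_10_holds(y, z):
    """Check X^2 == (CY * Z)^2 after eliminating U^2 from the two formulas."""
    u_sq = 1 - (1 - y**2) * (1 - z**2)  # from 1 - U^2 = (1 - Y^2)(1 - Z^2)
    x_sq = u_sq - y**2                  # from U^2 = X^2 + Y^2
    cy_sq = 1 - y**2                    # (CY)^2 by Lemma 2.9(iii)
    return x_sq == cy_sq * z**2


grid = [Fraction(k, 10) for k in range(1, 10)]
all_hold = all(lemma_2_10_holds(y, z) for y in grid for z in grid)
```

Since X and CY·Z are both positive elements of the field, equality of the squares gives X = CY·Z.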
LEMMA 2.11. For $i = 1, 2$, let $p_ir_is_i$ be right triangles and let $Y_i = [p_ir_i]$
and $Z_i = [p_is_i]$. Under these assumptions, if the angles at the vertices $p_1$
and $p_2$ are congruent, then $Y_1 \cdot Z_2 = Y_2 \cdot Z_1$.
PROOF. Assume that $\angle r_1p_1s_1 \cong \angle r_2p_2s_2$ and let $P_i$ be the half-line
$p_ir_i$ (Figure 14). The triangle $p_ir_is_i$ determines uniquely a point $q_i$ on the
half-line $p_is_i$ and a half-line $Q_i$ with origin $q_i$ such that $P_ip_iq_iQ_i$ is an
improper right triangle. Then $[p_1q_1] = [p_2q_2]$. Putting $[p_1q_1] = [p_2q_2] = X$
and applying Lemma 2.4 we get the formula
$Z_i = X \cdot Y_i$ ($i = 1, 2$),
which completes the proof.
Fig. 14
LEMMA 2.12. Let pqr be a right triangle, let s be the perpendicular projection
of q upon the straight line pr, and let X = [ps], Y = [rs], Z = [pr],
U = [pq], and V = [qr]. Then we have:
(i) $X \cdot Z = U \cdot U$,
(ii) $X \odot Z = Y \odot U \odot U$
(Figure 15).
PROOF. Formula (i) follows immediately from Lemma 2.11.
Let W = [qs]. Then
$U = X \odot W$, $V = Y \odot W$, $Z = U \odot V$,
and consequently
$X \odot Z = X \odot U \odot Y \odot W = Y \odot U \odot (X \odot W)$;
thus we arrive at formula (ii).
LEMMA 2.13. Given three distinct points p, s, r for which B(psr), let
X = [ps], Y = [sr], and Z = [pr]. We then have
(i) $CY = \dfrac{CX \cdot CZ}{1 - X \cdot Z}$
and
(ii) $Z = \dfrac{X + Y}{1 + X \cdot Y}$.
PROOF. To derive (i) we take a point q in such a way that pqr is a
right triangle and s is the perpendicular projection of q upon the straight
line pr (Figure 15). Let U = [pq]. Then by Lemma 2.12(i), we have
$X \cdot Z = U^2$, i.e.,
(3) $1 - X \cdot Z = (CU)^2$,
and, by Lemma 2.12(ii), we get $X \odot Z = Y \odot U \odot U$, which by Lemma
2.9(i) implies
(4) $CX \cdot CZ = CY \cdot (CU)^2$.
From (3) and (4) we obtain at once the desired formula.
From (i) and the inequality X < Z (which obviously follows from the
hypothesis) we derive (ii) by means of a simple algebraic transformation.
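Formulas (3) and (4) combine to CX·CZ = CY·(1 - X·Z), and the passage to an addition rule can be made concrete numerically. The following sketch rests on an assumption that the text does not state: over the real numbers, a free segment of hyperbolic length d is represented by tanh d, so that by Lemma 2.9(iii) its complement is sqrt(1 - tanh² d). Under that reading the relation obtained from (3) and (4) and the addition rule Z = (X + Y)/(1 + X·Y) are the familiar identities for the hyperbolic tangent. The function names are hypothetical.

```python
import math


def segment(d):
    """Assumed real representation of a free segment of hyperbolic length d."""
    return math.tanh(d)


def complement(x):
    """The complement CX = sqrt(1 - X^2), as in Lemma 2.9(iii)."""
    return math.sqrt(1.0 - x * x)


def lemma_2_13_holds(a, b):
    """Check both relations for collinear segments of lengths a, b and a + b."""
    x, y, z = segment(a), segment(b), segment(a + b)
    # Relation obtained from (3) and (4): CX * CZ = CY * (1 - X * Z).
    part_i = math.isclose(complement(x) * complement(z),
                          complement(y) * (1.0 - x * z), rel_tol=1e-9)
    # Addition rule for segments: Z = (X + Y) / (1 + X * Y).
    part_ii = math.isclose(z, (x + y) / (1.0 + x * y), rel_tol=1e-9)
    return part_i and part_ii


checks = [lemma_2_13_holds(a, b) for a in (0.1, 0.5, 1.3) for b in (0.2, 0.9, 2.0)]
```

The check is numerical and covers only the real field; the lemma itself is proved for every model by the geometric argument above.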
LEMMA 2.14. Given four distinct points p, q, r, s, we have
(i) B(pqr) if and only if $[pr] = \dfrac{[pq] + [qr]}{1 + [pq] \cdot [qr]}$;
(ii) D(pqrs) if and only if C[pq] = C[rs].
PROOF. Formula (i) follows from Lemma 2.13, in one direction directly,
in the other direction by a simple argument. Formula (ii) is obvious.
The metamathematical discussion begins with the representation
theorem.
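Let $\mathfrak{F} = \langle F, +, \cdot, \le \rangle$ be an arbitrary ordered field. By the n-dimensional
Klein space $\mathfrak{K}_n(\mathfrak{F})$ over the field $\mathfrak{F}$ we understand the system
$\langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ constructed in the following way: $A_{\mathfrak{F}}$ is the set of all
ordered n-tuples $x = \langle x_1, x_2, \ldots, x_n \rangle$ in $F \times F \times \ldots \times F$ (n times) for
which
$\sum_{i=1}^{n} x_i^2 < 1$.
For any ordered n-tuples $x = \langle x_1, x_2, \ldots, x_n \rangle$ and $y = \langle y_1, y_2, \ldots, y_n \rangle$ in
$A_{\mathfrak{F}}$ we put
$x \cdot y = \sum_{i=1}^{n} x_i y_i$
(thus $-1 < x \cdot y < 1$) and
$\Psi(x, y) = \dfrac{(1 - x \cdot x) \cdot (1 - y \cdot y)}{(1 - x \cdot y)^2}$.
We always have $\Psi(x, y) \le 1$. The betweenness relation $B_{\mathfrak{F}}$ among any
three n-tuples x, y, z in $A_{\mathfrak{F}}$ is characterized by the formula
$(1 + \sqrt{1 - \Psi(x, y)} \cdot \sqrt{1 - \Psi(y, z)}) \cdot \sqrt{1 - \Psi(x, z)} = \sqrt{1 - \Psi(x, y)} + \sqrt{1 - \Psi(y, z)}$.
The equidistance relation $D_{\mathfrak{F}}$ among any four n-tuples x, y, z, u in $A_{\mathfrak{F}}$ is
characterized by the formula
$\Psi(x, y) = \Psi(z, u)$.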
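The construction of the Klein space can be made concrete over the real numbers. The following Python sketch is an illustration under stated assumptions, not the paper's definition verbatim: it takes Ψ(x, y) to be (1 - x·x)(1 - y·y)/(1 - x·y)², reads betweenness as the addition rule (1 + ab)·c = a + b for the quantities a = sqrt(1 - Ψ(x, y)), b = sqrt(1 - Ψ(y, z)), c = sqrt(1 - Ψ(x, z)), and works in floating point with a tolerance rather than in an arbitrary ordered field. All function names are hypothetical.

```python
import math


def dot(x, y):
    """Coordinate-wise product sum x . y of two n-tuples."""
    return sum(a * b for a, b in zip(x, y))


def in_klein_space(x):
    """Points of the space: n-tuples whose squares sum to less than 1."""
    return dot(x, x) < 1


def psi(x, y):
    """Assumed form of the function Psi used to describe the Klein space."""
    return (1 - dot(x, x)) * (1 - dot(y, y)) / (1 - dot(x, y)) ** 2


def seg(x, y):
    """sqrt(1 - Psi); max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, 1 - psi(x, y)))


def between(x, y, z, tol=1e-9):
    """Betweenness B(x, y, z), read as the addition rule (1 + ab)c = a + b."""
    a, b, c = seg(x, y), seg(y, z), seg(x, z)
    return math.isclose((1 + a * b) * c, a + b, abs_tol=tol)


def equidistant(x, y, z, u, tol=1e-9):
    """Equidistance D(x, y, z, u): Psi(x, y) = Psi(z, u)."""
    return math.isclose(psi(x, y), psi(z, u), abs_tol=tol)


on_diameter = between((0.1, 0.0), (0.3, 0.0), (0.7, 0.0))
off_line = between((0.1, 0.0), (0.0, 0.3), (0.7, 0.0))
same_radius = equidistant((0.0, 0.0), (0.5, 0.0), (0.0, 0.0), (0.0, 0.5))
```

Three points on a diameter of the unit disc satisfy the betweenness condition in their natural order, while a point off the line does not; two points at equal Euclidean distance from the centre are equidistant from it.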
THEOREM 2.15 (REPRESENTATION THEOREM). A system $\mathfrak{M} = \langle A, B, D \rangle$
is a model of $\mathcal{H}_n$ ($n \ge 2$) if and only if it is isomorphic with the Klein
space $\mathfrak{K}_n(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over some real closed field $\mathfrak{F}$.
PROOF. It is well known that the Klein space $\mathfrak{K}_n(\mathfrak{R})$ over the ordered
field $\mathfrak{R}$ of real numbers is a model for $\mathcal{H}_n$. Hence, by the result of Tarski
used in the proof of Theorem 1.1, the same applies to all the spaces
$\mathfrak{K}_n(\mathfrak{F})$ where $\mathfrak{F}$ is a real closed field, as well as to all isomorphic systems.
To prove the theorem in the opposite direction consider an arbitrary
model $\mathfrak{M} = \langle A, B, D \rangle$ of $\mathcal{H}_n$ and the correlated system $\mathfrak{S} = \langle S, +, \cdot, \le \rangle$.
By Theorem 2.8, the system $\mathfrak{S}$ can be imbedded in a real closed field
$\mathfrak{F} = \langle F, +, \cdot, \le \rangle$, and we can construct the corresponding Klein model
$\mathfrak{K}_n(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over the field $\mathfrak{F}$. We introduce in the model $\mathfrak{M}$ a
rectangular coordinate system (each of the n coordinates of a point p
being of the form $\pm U$ where $U \in S$). It is easy to check that by correlating
with every point p of A the ordered n-tuple $X^p = \langle X^p_1, X^p_2, \ldots, X^p_n \rangle$
of its coordinates, we establish a 1-1 correspondence between the
points of A and the points of $A_{\mathfrak{F}}$. (See the definitions of $A_{\mathfrak{F}}$ and $\oplus$,
Theorem 2.8 and Lemma 2.9(ii).) It remains to be shown that this correspondence
establishes an isomorphism between $\mathfrak{M}$ and $\mathfrak{K}_n(\mathfrak{F})$. This
reduces to showing that the relations B and D among points of A can be
characterized in terms of the coordinates of these points in exactly the
same way in which the relations $B_{\mathfrak{F}}$ and $D_{\mathfrak{F}}$ among points of $A_{\mathfrak{F}}$ have
been defined in the Klein model $\mathfrak{K}_n(\mathfrak{F})$.
Consider two distinct points p and q in A and the correlated n-tuples
of coordinates $X^p$ and $X^q$. We first express the free segment [pq] in terms
of $X^p$ and $X^q$. An easy but lengthy calculation, based exclusively upon
Lemmas 2.9(i), 2.10, and 2.13, leads to
(5) $(C[pq])^2 = \Psi(X^p, X^q)$,
where $\Psi$ is the function used in describing the Klein space. (The argument
is analogous to that used in the Euclidean case, with the difference that
rectangles are replaced by right quadrangles.) From (5), Lemma 2.9(iii),
and Lemma 2.14(i) we conclude at once that the condition
$(1 + \sqrt{1 - \Psi(X^p, X^q)} \cdot \sqrt{1 - \Psi(X^q, X^r)}) \cdot \sqrt{1 - \Psi(X^p, X^r)} = \sqrt{1 - \Psi(X^p, X^q)} + \sqrt{1 - \Psi(X^q, X^r)}$
is necessary and sufficient for points p, q, r to satisfy the formula B(pqr).
Similarly, from (5) and Lemma 2.14(ii) we conclude that the condition
$\Psi(X^p, X^q) = \Psi(X^r, X^s)$
is necessary and sufficient for points p, q, r, s to satisfy the formula
D(pqrs). Thus the proof is completed.
Using Theorem 2.15 instead of 1.1 we obtain of course a new proof of
Theorem 1.2 and, actually, we can extend this result to arbitrary dimension
n:
THEOREM 2.16. The theory $\mathcal{H}_n$ ($n \ge 2$) is complete and decidable but
not finitely axiomatizable.
3. Applications to Related Geometrical Systems. Using the main results
stated in [5] for Euclidean geometry and in this paper for hyperbolic
geometry we shall now establish fundamental metamathematical properties
of elementary absolute geometry. The discussion in [5] has been
restricted to the two-dimensional case only for simplicity of formulation
and the results established there clearly extend to elementary n-dimensional
Euclidean geometry $\mathcal{E}_n$ for any $n \ge 2$.
Let the formalized system $\mathcal{A}_n$ of n-dimensional absolute geometry be a
theory which has the same symbolism as $\mathcal{E}_n$ and $\mathcal{H}_n$ and the axiom
system of which is obtained by omitting Euclid's axiom, A8, in the axiom
system of $\mathcal{E}_n$ (or the negation of A8 in the axiom system of $\mathcal{H}_n$). Thus a
sentence is valid in $\mathcal{A}_n$ if and only if it is valid in both $\mathcal{E}_n$ and $\mathcal{H}_n$. As
a simple consequence of Theorem 1 in [5] and Theorem 2.15 in the present
paper we obtain
THEOREM 3.1. $\mathfrak{M}$ is a model of $\mathcal{A}_n$ ($n \ge 2$) if and only if it is isomorphic
either with the Cartesian space $\mathfrak{C}_n(\mathfrak{F})$ or with the Klein space $\mathfrak{K}_n(\mathfrak{F})$ over
some real closed field $\mathfrak{F}$.
Theorem 3.1 contains a description of all models of $\mathcal{A}_n$ which is,
however, not uniform in its character; the class of models proves to
consist of two widely different subclasses. It would be interesting to
obtain a more homogeneous characterization of this class.
Theorems 2, 3, and 4 in [5] and Theorem 2.16 in the present paper
imply the following Theorems 3.2-3.4 as direct corollaries.
THEOREM 3.2. The theory $\mathcal{A}_n$ ($n \ge 2$) has just two complete and consistent
extensions, in fact, $\mathcal{E}_n$ and $\mathcal{H}_n$.
A consequence of Theorem 3.2 is that Euclid's axiom can be equivalently
replaced in the axiom system of $\mathcal{E}_n$ by any sentence whatsoever
which is valid in $\mathcal{E}_n$ but not in $\mathcal{H}_n$; the same of course applies to the
negation of Euclid's axiom in the axiom system of $\mathcal{H}_n$.
THEOREM 3.3. The theory $\mathcal{A}_n$ ($n \ge 2$) is decidable.
This theorem is an improvement of Tarski's decision theorem for $\mathcal{E}_n$.
THEOREM 3.4. The theory $\mathcal{A}_n$ ($n \ge 2$) is not finitely axiomatizable.
In conclusion we wish to make some remarks concerning the system
$\overline{\mathcal{H}}_n$ of non-elementary n-dimensional hyperbolic geometry. The main
difference between the symbolisms of $\mathcal{H}_n$ and $\overline{\mathcal{H}}_n$ consists primarily
in the fact that all the variables occurring in the former range over points,
while the latter contains also variables ranging over arbitrary point sets.
(The question whether $\overline{\mathcal{H}}_n$ contains in addition variables of higher
orders ranging over families of sets, etc. is irrelevant for the subsequent
remarks.) The axiom system of $\overline{\mathcal{H}}_n$ is obtained from that of $\mathcal{H}_n$ by replacing
the infinite collection of elementary continuity axioms by one
non-elementary axiom (see [5], p. 18). In every model $\mathfrak{M}$ of $\overline{\mathcal{H}}_n$ the
ordered field $\mathfrak{F}$ in which the system $\mathfrak{S}$ can be imbedded (see Theorem 2.8)
proves to be continuously ordered. Since a continuously ordered field $\mathfrak{F}$
is isomorphic with the field $\mathfrak{R}$ of real numbers, the correlated Klein space
$\mathfrak{K}_n(\mathfrak{F})$ is isomorphic with the Klein space $\mathfrak{K}_n(\mathfrak{R})$. Thus, by Theorem 2.15,
we conclude that every model $\mathfrak{M}$ of $\overline{\mathcal{H}}_n$ is isomorphic with the Klein
model $\mathfrak{K}_n(\mathfrak{R})$. In this way we arrive at
THEOREM 3.5. The theory $\overline{\mathcal{H}}_n$ is categorical.
This result is well known but all other proofs which are known to the
author are based upon an analytic formula for H(X) (see page 34) and
hence upon some properties of exponential and trigonometric functions.6
6 While the paper was in press the author noticed that a direct two-dimensional
proof of Lemma 2.4 (cf. Note 2.7 on p. 44) results at once from a theorem due to
Liebmann in [6], p. 191.
Moreover, the author succeeded in constructing in $\mathcal{A}_n$ ($n \ge 2$) an absolute
calculus of segments. This calculus leads to the representation theorems for both
Euclidean and Bolyai-Lobachevskian geometries.
Bibliography
[1] HILBERT, D. Grundlagen der Geometrie. 8th ed., Stuttgart 1956.
[2] HJELMSLEV, J. Beiträge zur Nicht-Eudoxischen Geometrie I-II. Det Kgl. Danske
Videnskabernes Selskab, Matematisk-Fysiske Meddelelser, vol. 21 (1944), Nr. 5.
[3] SZÁSZ, P. Direct introduction of Weierstrass homogeneous coordinates in the
hyperbolic plane, on the basis of the endcalculus of Hilbert. This volume, pp. 97-113.
[4] TARSKI, A. A decision method for elementary algebra and geometry. 2nd ed.,
Berkeley and Los Angeles 1951.
[5] TARSKI, A. What is elementary geometry? This volume, pp. 16-29.
[6] LIEBMANN, H. Elementargeometrischer Beweis der Parallelenkonstruktion und
neue Begründung der trigonometrischen Formeln der hyperbolischen Geometrie.
Mathematische Annalen, vol. 61 (1905), pp. 185-199.
Symposium on the Axiomatic Method
DIMENSION IN ELEMENTARY EUCLIDEAN GEOMETRY 1
DANA SCOTT
Princeton University, Princeton, New Jersey, U.S.A.
Introduction. It has been well over one hundred years since higher
dimensional geometry made its appearance in mathematics and at least
fifty years since the terminology of infinite dimensional spaces came into
general use. No one can deny the enrichment of the subject brought
about by the introduction of these notions, but even though the infinite
dimensional spaces would seem to be a direct generalization of the finite
dimensional spaces, it is clear that their importance in mathematics really
lies in a different direction. In finite dimensions we are concerned with
ever more complicated configurations of points, lines, planes, spheres, or
other algebraic varieties, and in this study a knowledge of facts in higher
dimensions often leads to a better understanding of the lower dimensions.
Of course, all such configurations are possible in an infinite dimensional
space, but in the study of any one particular problem so little of the space
is used that one might as well work in only a finite number of dimensions.
Thus, the question arises whether there is really anything new in infinite
dimensional geometry. The applications of infinite dimensional geometry
to analysis and the study of function spaces are something new and
beyond the finite dimensional theory, but this is not what is meant. From
the standpoint of pure geometry, is there anything new? In particular,
are there different kinds of infinite dimensional spaces ? Of course, anyone
can think of two distinct Hilbert spaces, for example, but is there any
geometrical property that distinguishes them? Cardinality is a property
that will often distinguish between two infinite dimensional spaces;
however, the cardinal number of a set is not really associated with the
internal structure of the space in isolation but only becomes meaningful
in comparisons with other sets. Thus, in the study of geometrical proper-
ties of spaces we wish to restrict attention to those constructions that can
be carried out within the space itself making use of only the given geo-
metrical notions. This point of view seems even to throw doubt on
1 The results of this paper represent a portion of a thesis submitted to the
Faculty of Princeton University in partial fulfillment of the requirements for the
degree of Doctor of Philosophy.
topological questions. The most useful facts of point-set topology nearly
always rest on operations performed on arbitrary subsets of the space, and
since the time of Cantor we have realized how vastly complicated these
subsets may become. Indeed, point-set topology with its heavy use of
infinite combinations and infinite repetitions of operations, though
derived from geometrical intuition, is a totally new discipline that has
moved far from the special world of Euclidean geometry. The same may
be said for other questions of the analysis of Hilbert spaces such as
completeness, existence of orthonormal bases, and the like. Thus, we may
be led to the conclusion that the usual geometrical notions do not
involve infinite sets or infinite sequences of points, and this will be the
convention adopted in the present paper. If the reader does not entirely
agree with this point of view, at least it is hoped that he will agree that
elementary geometrical notions do not involve infinite sets and that he
admits that even the infinite dimensional spaces contain the material for
many elementary constructions, so that there is a meaningful question
whether infinite dimensional spaces can be distinguished by their ele-
mentary properties.
First of all it must be said what Euclidean spaces, finite or infinite
dimensional, actually are. The definition chosen in Section 1 is the
standard one making use of vector spaces. Geometrical properties
of spaces must be formulated in terms of geometrically meaningful
notions. In the case of elementary properties there is no loss of gener-
ality in considering only finitary relations between points, and in this
context the term geometrically meaningful relation or simply geometri-
cal relation is given a precise definition. Finally elementary geometrical
properties are identified with those properties of a space expressible in
sentences of the first-order predicate logic in terms of the geometrical
relations over the space. Before giving the specifically geometric results,
a general theorem in the theory of models of the first-order logic is
presented in Section 2. The general result is then applied in a straightforward
way to geometry in Section 3, and it is shown that there are no
elementary geometrical properties distinguishing any two infinite dimensional
Euclidean spaces. In particular, for any given formal property, a very
simple method is given for calculating a finite dimension, m say, such
that the property is true in spaces of dimension m if and only if it is true
in all higher dimensions including all the infinite dimensions. The con-
sequences of this state of affairs for a certain formal theory of geometry
are indicated in Section 4.
The author would like to thank Professor Tarski, who originally
proposed the problem and who made many helpful comments on the
formulation of the results.
1. Euclidean Spaces and Geometrical Relations. Before discussing any
formal theory, it is necessary to determine the standard domains of
discourse to which the theory will be applied. As regards geometry, if we
were concerned only with finite dimensions, we could think simply of the
ordinary n-dimensional cartesian spaces whose points are n-tuples of real
numbers. But these are not sufficient for our purposes. In any case, there
is no need to think of a particular coordinate system as in the cartesian
spaces, because a distinguished coordinate system is not a purely geo-
metrical notion. A definition by vector space methods solves the problem
and eliminates any distinguished set of coordinates. In the first place, a
(standard) Euclidean space will be a vector space over the field of real
numbers having any finite or infinite linear dimension. In addition, to
give the essential Euclidean character to the space, a notion sufficient for
questions of distance and perpendicularity has to be supplied. A positive
definite inner product on the space will do just that. To sum up, a Euclidean
space is a 4-tuple $\langle V, +, \cdot, \bullet \rangle$, where V is a set of elements called the
points of the space; $\langle V, + \rangle$ is an abelian group; $\cdot$ is an operation from the
cartesian product of the real numbers with V to the set V satisfying the
following properties for all reals $\alpha, \beta$ and all $x, y \in V$:
(i) $1 \cdot x = x$;
(ii) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$;
(iii) $(\alpha + \beta) \cdot x = (\alpha \cdot x) + (\beta \cdot x)$;
(iv) $\alpha \cdot (x + y) = (\alpha \cdot x) + (\alpha \cdot y)$;
and finally $\bullet$ is an operation from pairs of elements of V to real numbers
such that for all reals $\alpha, \beta$ and all $x, y, z \in V$:
(v) if $x \ne 0$, then $x \bullet x > 0$;
(vi) $x \bullet y = y \bullet x$;
(vii) $((\alpha \cdot x) + (\beta \cdot y)) \bullet z = \alpha(x \bullet z) + \beta(y \bullet z)$;
where the symbol 0 denotes in the hypothesis of (v) the zero element of
the group.
German capital letters $\mathfrak{V}$ and $\mathfrak{W}$ will be used to denote Euclidean
spaces, and the corresponding sets of points will be denoted by Roman
capitals V and W.
In a Euclidean space the distance between points x and y, in symbols
$\|x - y\|$, can be introduced by definition:
$\|x - y\| = \sqrt{(x - y) \bullet (x - y)}$,
where $(x - y)$ on the right hand side means $(x + ((-1) \cdot y))$.
The above treatment of Euclidean spaces, though it does not involve
a choice of a particular coordinate system, does involve using a dis-
tinguished point: the origin or zero vector 0. The dependence on 0 is
eliminated in our definitions of the notions of subspace and isometry.
A subspace of a Euclidean space $\mathfrak{V}$ is a non-empty subset X of V such
that whenever x and y are points in X, then $\alpha \cdot x + (1 - \alpha) \cdot y$ is in X for
all real numbers $\alpha$. In other words, if a subspace contains two points of a
line, then it must contain all points of the line. An isometry from a subspace
X of a Euclidean space $\mathfrak{V}$ onto a subspace Y of a Euclidean space $\mathfrak{W}$
is a one-one function f from X onto Y preserving the distance between
points; that is, if x and y are in X, then $\|x - y\| = \|f(x) - f(y)\|$, where
the distance on the left hand side of the formula refers to the space $\mathfrak{V}$,
and, on the right hand side, to the space $\mathfrak{W}$.
In this terminology, subspaces of a Euclidean space are not again
Euclidean spaces since they are not necessarily vector subspaces. If a
subspace contains the zero vector, then it is a vector subspace. It is
thus obvious that a translation of the space will always carry any sub-
space onto a vector subspace, and hence we can say that every subspace
of a Euclidean space is isometric with a Euclidean space. Clearly, iso-
metries between Euclidean spaces always preserve dimension, and so
every subspace of a Euclidean space has an unambiguous dimension.
Having thus defined the dimension of a subspace, a simple property
of subspaces that is needed in the later work can be stated :
LEMMA 1.1. Every set of m + 1 points of a Euclidean space is con-
tained in a subspace of dimension at most m.
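Lemma 1.1 can be checked on coordinate examples. The sketch below is an illustration only and assumes points given by rational coordinates in a cartesian space: the smallest subspace (in the affine sense used here) containing the points has dimension equal to the rank of the difference vectors from one chosen base point, and m + 1 points give only m such vectors. The names `rank` and `affine_dimension` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        # Find a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def affine_dimension(points):
    """Dimension of the smallest subspace containing the given points."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs)


square = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0)]
collinear = [(1, 2, 3), (2, 4, 6), (3, 6, 9)]
simplex = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
dims = [affine_dimension(p) for p in (square, collinear, simplex)]
```

In each example the dimension is at most one less than the number of points, and the bound is attained only when the points are in general position.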
Somewhat more complicated but very easy to prove is the following :
LEMMA 1.2. Let X and Y be two subspaces of a Euclidean space $\mathfrak{V}$
having the same finite dimension. Then there is an isometry of $\mathfrak{V}$ onto itself,
mapping X onto Y, and leaving the intersection $X \cap Y$ pointwise fixed.
The question we turn to next is that of defining the concept of a geo-
metrically meaningful relation between points. There are two different
aspects to the question. First, we can consider a fixed Euclidean space and
relations between the points of that one space. Second, the class of all
Euclidean spaces can be considered, and relations in different spaces can
be compared. For the first problem the answer is very simple: In a
Euclidean space $\mathfrak{V}$, the geometrical relations over $\mathfrak{V}$ are just those relations
between points of V invariant under the group of all isometries of
$\mathfrak{V}$ onto itself. In other words, an n-ary relation R over V, or a subset R of
the cartesian power $V^n$, is a geometrical relation if for all isometries f of $\mathfrak{V}$
onto itself and all n-tuples $\langle x_0, \ldots, x_{n-1} \rangle \in V^n$ we have $\langle x_0, \ldots, x_{n-1} \rangle \in R$
if and only if $\langle f(x_0), \ldots, f(x_{n-1}) \rangle \in R$. The above definition, of course,
agrees with the well-known program of Klein which asserts that the
group of motions of the space should determine the geometry.
When we pass over to the class of all Euclidean spaces some care must
be taken. There are indeed some set-theoretical problems connected with
the idea of the class of all spaces. These problems do not cause any
essential difficulties and can all be solved by adopting a standard type of
formal set-theoretical framework. Rather more important here is the
question of comparison of different spaces. A geometrical relation over
the class of all Euclidean spaces should be an assignment of one geometri-
cal relation to each particular space. Clearly isometric spaces should get
isometric relations; but more than this, the assignment should be in-
sensitive to dimension. It would be hopeless to try to classify all ways of
assigning one kind of relation to one-dimensional spaces, another kind to
two-dimensional spaces, a third to three dimensions and so on. So we are
led to the restriction that a geometrical relation over all spaces is to be
invariant under isometries not only from one space onto another, but also
from one space into another. In more formal terms: an n-ary geometrical
relation over the class of all Euclidean spaces is a function R that assigns
to each space $\mathfrak{V}$ a subset $R_{\mathfrak{V}}$ of $V^n$ such that if f is an isometry of $\mathfrak{V}$ into
a space $\mathfrak{W}$, then for all n-tuples $\langle x_0, \ldots, x_{n-1} \rangle \in V^n$ we have
$\langle x_0, \ldots, x_{n-1} \rangle \in R_{\mathfrak{V}}$ if and only if $\langle f(x_0), \ldots, f(x_{n-1}) \rangle \in R_{\mathfrak{W}}$. The effect of the
above definition is to assure that if X is a subspace of a Euclidean space
$\mathfrak{V}$, then the relation $R_{\mathfrak{V}} \cap X^n$, the restriction to X, is isometric to the
relation obtained by considering X a Euclidean space in itself.
There are many examples of geometrical relations. The first interesting
case is that of binary relations. It can be shown that there are $2^{2^{\aleph_0}}$ geometrical
binary relations; in fact, they can all be obtained in the following
way: Let D be any set of non-negative real numbers. For each space $\mathfrak{V}$,
let the relation $R_{\mathfrak{V}}$ be defined by the condition $\langle x, y \rangle \in R_{\mathfrak{V}}$ if and only if
$\|x - y\| \in D$, for all $x, y \in V$. Then R is a geometrical relation. For
ternary relations we need mention only a few: betweenness, being the
midpoint, collinearity, forming an equilateral triangle, being equidistant
from two points, and so on. A similar description of all such relations can
easily be given in terms of sets of triples of real numbers.
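The description of binary geometrical relations by sets of distances can be tried out in coordinates. The following Python sketch is an illustration under assumptions of its own: points are tuples of floating-point coordinates, D is a finite set of distances compared with a tolerance, one isometry of the plane onto itself is built from a rotation and a translation, and one isometry of the plane into three-dimensional space appends a zero coordinate. The names `make_relation`, `plane_isometry` and `embed` are hypothetical.

```python
import math


def dist(x, y):
    """Euclidean distance ||x - y|| between two points given by coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def make_relation(D, tol=1e-9):
    """Binary relation determined by D: x R y if and only if ||x - y|| is in D."""
    def R(x, y):
        d = dist(x, y)
        return any(math.isclose(d, t, abs_tol=tol) for t in D)
    return R


def plane_isometry(theta, shift):
    """An isometry of the plane onto itself: rotation by theta, then translation."""
    c, s = math.cos(theta), math.sin(theta)

    def f(p):
        x, y = p
        return (c * x - s * y + shift[0], s * x + c * y + shift[1])
    return f


def embed(p):
    """An isometry of the plane into three-dimensional space."""
    return (p[0], p[1], 0.0)


R = make_relation({1.0, 5.0})
f = plane_isometry(0.7, (2.0, -1.0))
pairs = [((0.0, 0.0), (3.0, 4.0)), ((0.0, 0.0), (1.0, 1.0)), ((2.0, 2.0), (2.0, 3.0))]
invariant_onto = all(R(f(x), f(y)) == R(x, y) for x, y in pairs)
invariant_into = all(R(embed(x), embed(y)) == R(x, y) for x, y in pairs)
```

Because the relation depends only on the distance between its arguments, it is unchanged both by isometries of a space onto itself and by isometries into a space of higher dimension, which is the invariance required of a geometrical relation over the class of all spaces.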
Finally it is to be noted that the definition can be extended to cover
geometrical relations between lines, planes, spheres, and the like, but this
is hardly ever necessary and will not be considered here. In view of the
fact that the various algebraic loci are completely determined by a finite
number of points lying on them, any relation between such objects can be
encoded into an equally powerful relation between points. For example, a
binary relation between lines can always be replaced by a quaternary
relation between points. Also there is no need to consider more than one
geometrical relation between points, since two relations, one n-ary and
one m-ary say, can always be replaced by a single (n + m)-ary relation in
an obvious way.
2. Arithmetical Extensions of Finite Degree. All terminology of the
paper of Tarski and Vaught [2] will be adopted for the purposes of this
section, except that relational systems $\langle A, R \rangle$ are considered where R is
an n-ary relation, or relation of rank n, rather than just a ternary relation.
The integer n, however, is to be fixed for the discussion. The formal
theory T in the first-order predicate logic must then contain an n-placed
predicate symbol P, as well as the standard logical symbols.
In particular, we are interested in the specific algebraic condition given
by Tarski and Vaught [2] in Theorem 3.1 for one relational system to be
an arithmetical extension of another. The general notion of arithmetical
extension defined in that paper concerns all possible formulas of the first-
order logic, and for the purposes of this paper a weaker notion involving
only a restricted class of formulas is needed. The formal definition
follows.
DEFINITION 2.1. The system $\mathfrak{S} = \langle B, S \rangle$ is called an m-degree arithmetical
extension of the system $\mathfrak{R} = \langle A, R \rangle$ if the following two conditions
are satisfied:
(i) $\mathfrak{S}$ is an extension of $\mathfrak{R}$;
(ii) for every formula $\phi$ containing at most m distinct variables and every
sequence $x \in A^{(\omega)}$, x satisfies $\phi$ in $\mathfrak{R}$ if and only if x satisfies $\phi$ in $\mathfrak{S}$.
It should be noted that there is no loss of generality in condition (ii) if
the formula $\phi$ is required to contain only variables from the specific list
$v_0, v_1, \ldots, v_{m-1}$, and then the sequence x can be chosen simply from the
set $A^m$.
A generalization of Theorem 3.1 of [2] can now be stated and proved.
THEOREM 2.2. The following two conditions are (jointly) sufficient for a
system $\mathfrak{S} = \langle B, S \rangle$ to be an m-degree arithmetical extension of a system
$\mathfrak{R} = \langle A, R \rangle$:
(i) $\mathfrak{S}$ is an extension of $\mathfrak{R}$;
(ii) for any subset A' of A with less than m elements and any element b of B,
there exists an automorphism f of $\mathfrak{S}$ such that f leaves A' pointwise fixed
and $f(b) \in A$.
PROOF. Using the remark following Definition 2.1 we restrict attention
to formulas involving at most the variables $v_0, v_1, \ldots, v_{m-1}$ and proceed
by induction on the length of formulas. Assuming conditions (i) and (ii)
above, the following is the statement to be proved for formulas $\phi$ of the
restricted type:
(*) for all $x \in A^m$, x satisfies $\phi$ in $\mathfrak{R}$ if and only if x satisfies $\phi$ in $\mathfrak{S}$.
The statement (*) is obviously true for atomic formulas, and it is very
easy to show that if (*) holds for formulas $\phi$ and $\psi$ then it holds for $\neg\phi$
and $\phi \wedge \psi$. Suppose now that (*) holds for $\phi$, and consider the formula
$\bigvee v_k \phi$, where $k < m$. Assume first that $x \in A^m$ and x satisfies $\bigvee v_k \phi$ in $\mathfrak{R}$.
Then for some element $a \in A$, $x(k/a)$ satisfies $\phi$ in $\mathfrak{R}$. By the hypothesis
$x(k/a)$ satisfies $\phi$ in $\mathfrak{S}$, and hence x satisfies $\bigvee v_k \phi$ in $\mathfrak{S}$, as was to be shown.
Assume now that $x \in A^m$ and x satisfies $\bigvee v_k \phi$ in $\mathfrak{S}$. Let $b \in B$ be such that
$x(k/b)$ satisfies $\phi$ in $\mathfrak{S}$. The set $A' = \{x_i \mid i < m, i \ne k\}$ is a subset of A
with fewer than m elements. In view of condition (ii), let f be an automorphism
of $\mathfrak{S}$ leaving A' pointwise fixed and with $f(b) \in A$. Since f is an
automorphism of $\mathfrak{S}$, the sequence $\langle f(x_0), \ldots, f(x_{k-1}), f(b), \ldots, f(x_{m-1}) \rangle =
x(k/f(b))$ must satisfy $\phi$ in $\mathfrak{S}$. Hence, from (*) for $\phi$, we have that $x(k/f(b))$
satisfies $\phi$ in $\mathfrak{R}$, and finally x satisfies $\bigvee v_k \phi$ in $\mathfrak{R}$, which completes the
proof that (*) holds for $\bigvee v_k \phi$. Thus, by induction (*) is established for all
formulas and the theorem is proved.
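Condition (ii) of Theorem 2.2 can be tested by brute force on small finite systems. The sketch below is an illustration only, restricted (as an assumption) to a binary relation S given as a set of ordered pairs; it enumerates all automorphisms of the larger system and looks, for each subset A' with fewer than m elements and each element b, for an automorphism that fixes A' pointwise and carries b into A. The function names are hypothetical.

```python
from itertools import combinations, permutations


def automorphisms(B, S):
    """Yield all permutations of B (as dicts) that preserve the binary relation S."""
    elems = sorted(B)
    for image in permutations(elems):
        f = dict(zip(elems, image))
        if all(((f[a], f[b]) in S) == ((a, b) in S) for a in elems for b in elems):
            yield f


def condition_ii(A, B, S, m):
    """Check condition (ii) of Theorem 2.2 for the subset A of B and degree m."""
    autos = list(automorphisms(B, S))
    for k in range(m):  # subsets A' of A with fewer than m elements
        for fixed in combinations(sorted(A), k):
            for b in B:
                if not any(all(f[a] == a for a in fixed) and f[b] in A
                           for f in autos):
                    return False
    return True


# Example: five points with the relation "x is different from y".
B = set(range(5))
S = {(a, b) for a in B for b in B if a != b}
A = {0, 1, 2}
holds_for_3 = condition_ii(A, B, S, 3)
holds_for_4 = condition_ii(A, B, S, 4)
```

In this example every permutation is an automorphism. The condition holds for m = 3 but fails for m = 4, in agreement with the fact that a sentence with four variables ("there are four distinct elements") is true in the larger system and false in the three-element subsystem.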
In the original version of this paper, the author proved a somewhat
different form of Theorem 2.2 which does not require the existence of
automorphisms of the system $\mathfrak{S}$ but rather uses a whole class of isomorphic
subsystems of $\mathfrak{S}$ whose union covers the set B. However,
Euclidean spaces possess so many isometries, as was noted in Lemma
1.2 of Section 1, that the simpler theorem just given is quite adequate for
the results of the next section. The other version of the general algebraic
condition for m-degree extensions and its applications will be published
elsewhere. Notice that Theorem 2.2 implies Theorem 3.1 of Tarski-Vaught
[2], since being an arithmetical extension is equivalent to being
an m-degree extension for each m, and the condition (ii) of their Theorem
3.1 obviously implies condition (ii) of Theorem 2.2 above.
3. Relational Systems Derived from Euclidean Spaces. Let $\mathfrak{V}$ be a
Euclidean space and let $R$ be an $n$-ary geometrical relation over $\mathfrak{V}$. The
system $\langle V, R\rangle$ is a relational system and any subspace $X$ of $\mathfrak{V}$ yields a
corresponding subsystem $\langle X, R \cap X^n\rangle$. Our first theorem shows the
relation of the theory of first-order sentences true of $R$ in the whole space $\mathfrak{V}$
to those true in the subspace $X$.
THEOREM 3.1. If $R$ is an $n$-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ is a subspace of $\mathfrak{V}$ of dimension at least $m$, then the
relational system $\langle V, R\rangle$ is an $(m + 1)$-degree arithmetical extension of
$\langle X, R \cap X^n\rangle$.
PROOF. We need only verify condition (ii) of Theorem 2.2. Let $X'$ be
a subset of $X$ with at most $m$ elements. Since we may obviously assume
$m > 0$, $X'$ can be contained in a subspace $Y$ of dimension $m - 1$ which is
also contained in $X$. Let $Y_0$ be a subspace of dimension exactly $m$
containing $Y$ and contained in $X$. Let $b$ be any point in $V$ not in $X$. Obviously
we can find a subspace $Y_1$ of dimension exactly $m$ containing $Y$ and
containing $b$. Since $Y_0$ is included in $X$ and $b$ is not, $Y_0 \cap Y_1 = Y$. Using
Lemma 1.2, let $f$ be an isometry of $\mathfrak{V}$ onto itself taking $Y_1$ onto $Y_0$ and
leaving $Y$ pointwise fixed. The function $f$ will thus be an automorphism of
$\langle V, R\rangle$ such that $f(b) \in X$, which completes the proof.
COROLLARY 3.2. If $R$ is an $n$-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ is an infinite dimensional subspace of $\mathfrak{V}$, then the relational
system $\langle V, R\rangle$ is an arithmetical extension of $\langle X, R \cap X^n\rangle$.
COROLLARY 3.3. If $R$ is an $n$-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ and $Y$ are two infinite dimensional subspaces of $\mathfrak{V}$, then the
relational systems $\langle X, R \cap X^n\rangle$ and $\langle Y, R \cap Y^n\rangle$ are arithmetically
equivalent.
In less formal terms the above results can be explained in the following
way. Let $\mathfrak{V}$ be a Euclidean space and $R$ be a geometrical relation. Consider
a sentence $\phi$ in the formal first-order theory of a predicate that is to be
interpreted as the relation $R$. We ask whether $\phi$ expresses a true property
of $R$. Now $\phi$ contains only finitely many symbols and in particular only
a finite number of variables, $m + 1$ say. Theorem 3.1 shows us that the
truth of $\phi$ can be established by looking not at the whole space, but only
at $m$-dimensional subspaces of $\mathfrak{V}$. If $\mathfrak{V}$ is already of a dimension
smaller than $m$, this result is not of much help. However, if $\mathfrak{V}$ has a very
large dimension or is infinite dimensional, then the reduction is
considerable. In particular, Corollary 3.3 shows that no single first-order property
or even a set of first-order properties of the geometrical relation $R$ can
ever distinguish between two infinite dimensional subspaces of $\mathfrak{V}$.
We turn now from one space to the class of all spaces. Here we need to
consider geometrical relations over the whole class of spaces as defined
in Section 1. As a direct consequence of Theorem 3.1 we obtain:
THEOREM 3.4. If $R$ is a geometrical relation over the class of all Euclidean
spaces and $\phi$ is a sentence of first-order logic with $(m + 1)$ distinct variables,
then $\phi$ is true in all relational systems $\langle V, R_{\mathfrak{V}}\rangle$, where $\mathfrak{V}$ is a Euclidean
space of dimension at least $m$, if and only if $\phi$ is true in at least one such
relational system.
COROLLARY 3.5. If $R$ is a geometrical relation over the class of all
Euclidean spaces and $\mathfrak{V}$ and $\mathfrak{W}$ are infinite dimensional Euclidean spaces,
then the relational systems $\langle V, R_{\mathfrak{V}}\rangle$ and $\langle W, R_{\mathfrak{W}}\rangle$ are arithmetically
equivalent.
The argument that leads to 3.5 can be extended to show that there is
no collection of geometrical relations and no collection of their first-order
properties that can distinguish between any two infinite dimensional
Euclidean spaces. It should be clear from the proof given above that if we
only wanted this result about infinite dimensional spaces, it would be
possible to use Theorem 3.1 of Tarski-Vaught [2] directly without going
through the generalization of that method given in Section 2. However,
the relation between truth in the whole space and truth in its finite
dimensional subspaces as developed here in Theorem 3.4 leads to an even
stronger result about infinite dimensional geometry as is explained in the
next section. Furthermore, it allows us to establish a criterion for de-
termining whether a relation defined in first-order logic in terms of a
given geometrical relation is again such a relation. This criterion is pre-
sented in Theorem 3.6 below.
First it must be made clear when one relation is definable (in first-order
logic) in terms of another relation. Let $R$ be a geometrical relation
and let $\phi$ be a formula in the first-order theory of $R$ whose free variables
are all contained in the list $v_0, v_1, \ldots, v_{p-1}$. We can easily think of $\phi$ as
defining a new $p$-ary relation, $S$ say, in terms of $R$. Of course this definition
must be made relative to each Euclidean space separately, and so $S$ must
be thought of as a function from spaces $\mathfrak{V}$ to subsets $S_{\mathfrak{V}}$ of $V^p$. Finally in
precise terms we say that $S$ is defined by $\phi$ in terms of $R$ if for each
Euclidean space $\mathfrak{V}$ and for all sequences $x \in V^{(\omega)}$, we have $\langle x_0, \ldots, x_{p-1}\rangle
\in S_{\mathfrak{V}}$ if and only if $x$ satisfies $\phi$ in $\langle V, R_{\mathfrak{V}}\rangle$. Not every formula $\phi$ leads to a
geometrical relation $S$, however. To see this, let $\phi$ contain freely none of
the variables $v_0, \ldots, v_{p-1}$, and choose the formula in such a way that it
expresses a property of spaces true in only one dimension; then $S$ will not
be geometrical. The test that $S$ must pass to be a geometrical relation is
given next.
THEOREM 3.6. Let the $p$-ary relation $S$ be defined by the formula $\phi$ in
terms of the geometrical relation $R$. Suppose further that the total number of
variables in $\phi$, including the free variables, is $m + 1$. Then $S$ is a geometrical
relation if and only if for all Euclidean spaces $\mathfrak{V}$ and $\mathfrak{W}$ of dimension at most
$m$ and all isometries $f$ of $\mathfrak{V}$ into $\mathfrak{W}$, we have for all sequences $x \in V^{(\omega)}$,
$\langle x_0, \ldots, x_{p-1}\rangle \in S_{\mathfrak{V}}$ if and only if $\langle f(x_0), \ldots, f(x_{p-1})\rangle \in S_{\mathfrak{W}}$.
PROOF. Obviously, if $S$ is a geometrical relation, then it satisfies the
condition given for isometries. Suppose then that $S$ is not a geometrical
relation. Thus, there must be Euclidean spaces $\mathfrak{V}$ and $\mathfrak{W}$ and an isometry
$f$ from $\mathfrak{V}$ into $\mathfrak{W}$ and a sequence $x \in V^{(\omega)}$ such that the formulas
$\langle x_0, \ldots, x_{p-1}\rangle \in S_{\mathfrak{V}}$ and $\langle f(x_0), \ldots, f(x_{p-1})\rangle \in S_{\mathfrak{W}}$ are not equivalent. By
the symmetry of the situation we need treat only the case where
$\langle x_0, \ldots, x_{p-1}\rangle \in S_{\mathfrak{V}}$ and $\langle f(x_0), \ldots, f(x_{p-1})\rangle \notin S_{\mathfrak{W}}$. Since $\phi$ defines $S$, we
conclude that $x$ satisfies $\phi$ in $\langle V, R_{\mathfrak{V}}\rangle$ and the sequence $\langle f(x_0), f(x_1), \ldots\rangle$
does not satisfy $\phi$ in $\langle W, R_{\mathfrak{W}}\rangle$. Due to the fact that $\phi$ has only free
variables in the set $\{v_0, \ldots, v_{p-1}\}$ we can assume without loss of generality
that $\{x_i \mid i < \omega\} = \{x_0, \ldots, x_{p-1}\}$. Now by hypothesis $p \le m + 1$, and so
there exists a subspace $X$ of $\mathfrak{V}$ of dimension at most $m$ containing the set
$\{x_0, \ldots, x_{p-1}\}$; in particular, if $\mathfrak{V}$ is of dimension at most $m$, we shall
assume $X = V$, and otherwise that $X$ is of dimension exactly $m$. In any
case we can conclude with the aid of Theorem 3.1 that the sequence $x$
satisfies $\phi$ in $\langle X, R_{\mathfrak{V}} \cap X^n\rangle$. The image of $X$ under $f$ is a subspace $X'$ of
$\mathfrak{W}$ of dimension equal to that of $X$. Let $Y$ be a subspace of $\mathfrak{W}$ that is
either equal to $W$ in case $\mathfrak{W}$ is of dimension less than $m$ or is of dimension
exactly $m$, and which in any case contains $X'$. By the same argument as
above the sequence $\langle f(x_0), f(x_1), \ldots\rangle$ does not satisfy $\phi$ in $\langle Y, R_{\mathfrak{W}} \cap Y^n\rangle$.
Now the two subspaces $X$ of $\mathfrak{V}$ and $Y$ of $\mathfrak{W}$ are themselves isometric with
Euclidean spaces $\mathfrak{V}'$ and $\mathfrak{W}'$ of dimensions at most $m$ by isometries $g$ and $h$,
where $g$ is from $V'$ onto $X$ and $h$ is from $Y$ onto $W'$. Let $f' = hfg$, which
will be an isometry from $\mathfrak{V}'$ into $\mathfrak{W}'$. By our very construction we can
obviously conclude that the sequence $\langle g^{-1}(x_0), g^{-1}(x_1), \ldots\rangle$ satisfies
$\phi$ in $\langle V', R_{\mathfrak{V}'}\rangle$ and hence $\langle g^{-1}(x_0), \ldots, g^{-1}(x_{p-1})\rangle \in S_{\mathfrak{V}'}$, while $\langle hf(x_0), \ldots,
hf(x_{p-1})\rangle \notin S_{\mathfrak{W}'}$. This finally shows that $S$ does not satisfy the condition
of the theorem, which completes the proof.
4. Axiomatic Geometry. In his paper [3], Tarski presents a particularly
neat axiomatic system for two-dimensional Euclidean geometry in terms
of the basic notions of betweenness and equidistance. As is indicated in [3] an
axiomatization for any finite dimension can be obtained by a simple
change in two of the axioms. All these axiomatic theories are decidable,
and it follows from the method in Tarski's monograph [1] that there is
even a uniform method for deciding for each integer m whether an ele-
mentary sentence in terms of betweenness and equidistance is true in
Euclidean spaces of dimension m. It is to be shown here that there is also
an effective decision method for the class of sentences true in infinite
dimensional spaces.
Let $B$ and $E$ be respectively ternary and quaternary geometrical
relations denoting the betweenness and equidistance relations in Euclidean
spaces. The first-order theory, then, must contain a ternary and a
quaternary predicate symbol. Let $\mathcal{E}_m$, $m < \omega$, be the class of all sentences of
this first-order theory true in the relational systems $\langle V, B_{\mathfrak{V}}, E_{\mathfrak{V}}\rangle$ where $\mathfrak{V}$
is an $m$-dimensional Euclidean space. Since all $m$-dimensional spaces are
isometric, the theory $\mathcal{E}_m$ is complete. In this section we shall often use the
word theory to mean any class of sentences of the first-order logic that is
consistent and is closed under all the usual rules of deduction, while a
complete theory is a maximal such class. Let $\mathcal{E} = \bigcap_{m<\omega} \mathcal{E}_m$ be the common
part of all these theories, that is, the class of sentences true in all finite
dimensions. $\mathcal{E}$ is, of course, not a complete theory, but it is a decidable
theory as will be shown below. One further theory will be considered,
namely $\mathcal{E}_\infty = \bigcup_{m<\omega} \bigcap_{n>m} \mathcal{E}_n$, that is, the class of sentences true in all but a
finite number of dimensions. $\mathcal{E}_\infty$ is a theory since it is the union of an
increasing sequence of theories, but what is surprising is that $\mathcal{E}_\infty$ is a
complete theory and, in fact, is the class of sentences true in all infinite
dimensions. We turn now to the systematic account of these results.
LEMMA 4.1. $\mathcal{E}_m \sim \bigcup_{n \neq m} \mathcal{E}_n \neq 0$.
PROOF. In words : there is a sentence true in the dimension m but not
true in any other dimension. To demonstrate this one has only to trans-
late into formal logical symbols the sentence that expresses the fact that
there exists a configuration of m + 1 distinct and mutually equidistant
points, but no such configuration with m + 2 points. Notice that the
trivial dimension $m = 0$ is accommodated quite nicely.
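The dimension-counting fact behind Lemma 4.1 can be checked with a small exact computation. The sketch below is illustrative only and is not part of the original argument: the function names are hypothetical, and the distance is normalised to 1. For $k$ points at mutual distance 1, the Gram matrix of the differences $p_i - p_0$ has entries 1 on the diagonal and 1/2 elsewhere; its determinant is non-zero, so the $k - 1$ differences are linearly independent and such a configuration exists in dimension $n$ exactly when $n \ge k - 1$.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


def difference_gram(k):
    """Gram matrix of p_i - p_0 (i = 1..k-1) for k points at mutual distance 1.

    (p_i - p_0).(p_j - p_0) = (1 + 1 - |p_i - p_j|^2) / 2, i.e. 1 if i == j, else 1/2.
    """
    return [[Fraction(1) if i == j else Fraction(1, 2) for j in range(k - 1)]
            for i in range(k - 1)]


def equidistant_points_exist(k, n):
    """k distinct mutually equidistant points exist in dimension n iff n >= k - 1."""
    assert determinant(difference_gram(k)) != 0  # the differences are independent
    return n >= k - 1


def delta_holds(m, n):
    """Truth in dimension n of: 'there are m + 1 such points but not m + 2'."""
    return equidistant_points_exist(m + 1, n) and not equidistant_points_exist(m + 2, n)
```

Under these assumptions `delta_holds(m, n)` is true exactly when `n == m`, which is the property required of the sentence in the lemma.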
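LEMMA 4.2. If $\Delta_m$ is any sentence in the set $\mathcal{E}_m \sim \bigcup_{n \neq m} \mathcal{E}_n$, then
$\mathcal{E}_m = cl(\mathcal{E} \cup \{\Delta_m\})$.
REMARK. The symbol $cl(\mathcal{X})$ denotes the closure of the set of sentences
$\mathcal{X}$ under the rules of deduction of the first-order predicate logic. Thus 4.2
expresses the fact that the theory $\mathcal{E}_m$ results from the theory $\mathcal{E}$ by the
addition of any single axiom chosen as indicated in the hypothesis of the
lemma.
PROOF. Assuming the hypothesis, let $\phi$ be any sentence in $\mathcal{E}_m$ and
consider the implication $[\Delta_m \to \phi] = \neg[\Delta_m \wedge \neg\phi]$. Clearly $[\Delta_m \to \phi] \in \mathcal{E}_m$
since $\phi \in \mathcal{E}_m$. But also, for any $n \neq m$, $\Delta_m \notin \mathcal{E}_n$ and so $\neg\Delta_m \in \mathcal{E}_n$;
hence, $[\Delta_m \to \phi] \in \mathcal{E}_n$. It follows at once that $[\Delta_m \to \phi] \in \mathcal{E}$. This
argument shows that $\mathcal{E}_m \subseteq cl(\mathcal{E} \cup \{\Delta_m\})$. The obviousness of the opposite
inclusion completes the proof.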
THEOREM 4.3. The only finite complete extensions of the theory $\mathcal{E}$ are
the theories $\mathcal{E}_m$, $m < \omega$.
PROOF. That each complete theory $\mathcal{E}_m$ is a finite extension of $\mathcal{E}$ is the
content of Lemmas 4.1 and 4.2. Assume then that $\mathcal{E}^*$ is a finite complete
extension of $\mathcal{E}$ with $\mathcal{E}^* \neq \mathcal{E}_m$ for all $m < \omega$. Let $\Delta^*$ be the single axiom
needed to have $\mathcal{E}^* = cl(\mathcal{E} \cup \{\Delta^*\})$. Since $\mathcal{E}^* \neq \mathcal{E}_m$, it follows that
$\Delta^* \notin \mathcal{E}_m$. Thus, $\neg\Delta^* \in \mathcal{E}_m$ for all $m < \omega$, which implies $\neg\Delta^* \in \mathcal{E}$. Thus,
the theory $\mathcal{E}^*$ would have to be inconsistent, which is impossible.
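LEMMA 4.4. If the sentences $\Delta_m$ are chosen as in Lemma 4.2, then
$\mathcal{E}_\infty = cl(\mathcal{E} \cup \{\neg\Delta_m \mid m < \omega\})$.
PROOF. $\neg\Delta_m \in \bigcap_{n>m} \mathcal{E}_n$ by construction, and hence $\neg\Delta_m \in \mathcal{E}_\infty$ for each
$m < \omega$. Thus, $cl(\mathcal{E} \cup \{\neg\Delta_m \mid m < \omega\}) \subseteq \mathcal{E}_\infty$. Let $\phi$ be any sentence in $\mathcal{E}_\infty$.
There exists an integer $m$ such that $\phi \in \bigcap_{n \ge m} \mathcal{E}_n$. Consider the implication
$[[\neg\Delta_0 \wedge \neg\Delta_1 \wedge \ldots \wedge \neg\Delta_{m-1}] \to \phi]$. It is easy to see that this sentence
is in $\mathcal{E}$ and hence $\phi \in cl(\mathcal{E} \cup \{\neg\Delta_m \mid m < \omega\})$. The converse inclusion is thus
established.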
LEMMA 4.5. $\mathcal{E}_\infty$ is the set of sentences true in all infinite dimensional
spaces and is complete.
PROOF. This lemma is a direct consequence of Theorem 3.4, Corollary
3.5, and the definition of $\mathcal{E}_\infty$.
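THEOREM 4.6. There is only one infinite complete extension of the theory
$\mathcal{E}$ and that is the theory $\mathcal{E}_\infty$.
PROOF. That $\mathcal{E}_\infty$ is a complete extension of $\mathcal{E}$ follows from Lemma
4.5. Let $\mathcal{E}^*$ be any other infinite complete extension of $\mathcal{E}$. We have
$\mathcal{E}^* \neq \mathcal{E}_m$ for all $m < \omega$. In the notation of Lemma 4.2, $\Delta_m \notin \mathcal{E}^*$ for all
$m < \omega$. Hence $\neg\Delta_m \in \mathcal{E}^*$ for all $m < \omega$. This last implies, in view of
Lemma 4.4, that $\mathcal{E}_\infty \subseteq \mathcal{E}^*$. Since both these theories are complete, we
conclude that $\mathcal{E}_\infty = \mathcal{E}^*$, as was to be shown.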
THEOREM 4.7. The theories $\mathcal{E}$ and $\mathcal{E}_\infty$ are decidable.
PROOF. Let $\phi$ be any sentence. Count the number of variables in
$\phi$, say $m + 1$. Now $\phi \in \mathcal{E}_\infty$ if and only if $\phi \in \mathcal{E}_m$ by Theorem 3.4. Since
the condition $\phi \in \mathcal{E}_m$ can be decided effectively, we have an effective
decision procedure for $\mathcal{E}_\infty$. Finally, notice that $\phi \in \mathcal{E}$ if and only if
$\phi \in \bigcap_{n \le m} \mathcal{E}_n$; again a condition that can be checked in a finite number of
steps. The proof is complete.
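The reduction in this proof can be written out as a short procedure. The following is only a sketch under stated assumptions: the `Sentence` record and its `true_in_dimension` field are hypothetical stand-ins for a sentence together with a decision method for the finite dimensional theories (such as the one derived from [1]); no actual decision method is implemented here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Sentence:
    # Hypothetical record: the number of distinct variables (m + 1 in the text)
    # and an assumed oracle deciding truth in a given finite dimension n.
    num_variables: int
    true_in_dimension: Callable[[int], bool]


def in_infinite_dimensional_theory(phi: Sentence) -> bool:
    """phi is true in infinite dimensions iff it is true in dimension m (Theorem 3.4)."""
    m = max(phi.num_variables - 1, 0)
    return phi.true_in_dimension(m)


def in_common_theory(phi: Sentence) -> bool:
    """phi is true in every finite dimension iff it is true in dimensions 0, ..., m."""
    m = max(phi.num_variables - 1, 0)
    return all(phi.true_in_dimension(n) for n in range(m + 1))
```

For instance, a sentence asserting the existence of three distinct mutually equidistant points uses three variables and is true exactly in dimensions at least 2, so a toy oracle `lambda n: n >= 2` places it in the infinite dimensional theory but not in the common theory.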
THEOREM 4.8. For any formula $\phi$ with all free variables in the set
$\{v_0, \ldots, v_{p-1}\}$, it can be decided effectively whether $\phi$ defines a $p$-ary
geometrical relation in terms of the geometrical relations $B$ and $E$.
PROOF. The formula $\phi$ will contain only $m + 1$ variables. According
to Theorem 3.6, we need only check whether $\phi$ defines a geometrical
relation with respect to Euclidean spaces of at most dimension $m$. In fact,
it is sufficient to restrict attention to one Euclidean space $\mathfrak{V}$ of dimension
exactly $m$ and only consider the identity isometries from Euclidean
subspaces of $\mathfrak{V}$ onto themselves. This checking can be carried out by
seeing if the relation defined by $\phi$ and restricted to a subspace is the same
relation obtained by restricting all the free variables in $\phi$ to the subspace
and relativising all the quantifiers in $\phi$ to the subspace. But the predicate
of being in the least subspace spanned by a given number of points is
definable in first-order logic in terms of betweenness and equidistance.
Thus, since the number of points needed for specifying a subspace is at
most $m + 1$, we can translate the question of the equivalence of the two
forms of the relation defined by $\phi$ into a single first-order sentence. This
sentence, then, need only be checked for validity in dimension $m$, a
process that is effective.
This completes the formal development of the subject, and the author
would like to conclude with some informal remarks. An amusing point to
notice in the arguments of this section is that any sentence Am satisfying
the conditions of Lemma 4.2 must necessarily contain at least m + 2
variables. That the lower bound can actually be attained in the theory of
B and E can be verified by writing out in logical symbols the sentence
given in words in the proof of Lemma 4.1.
The consequence of these results for the problem of axiomatizing
Euclidean geometry is that the theory $\mathcal{E}$ is the only one of these theories
that need be axiomatized, for we have shown above that one may pass
from $\mathcal{E}$ to $\mathcal{E}_m$ simply by the adjunction of the sentences $\Delta_m$. Though all
details have not been completely checked by the author, it would seem that
an adequate axiomatization of the theory $\mathcal{E}$ would result by dropping
axioms A11 and A12 of the system given by Tarski in [3]. Finally, the
simplest way of axiomatizing infinite dimensional geometry would be to
add to $\mathcal{E}$ an infinite list of sentences expressing the fact that any number
of mutually equidistant points can be found.
This last remark about infinite dimensional geometry indicates an
immediate difference between the first-order formalism and theories
permitting quantification over arbitrary finite sets as explained for the
theory $\mathcal{E}_2'$ in [3]. For it is seen at once that the infinite dimensional
character of a space can be expressed in a single sentence involving
variables ranging over arbitrary finite sets, a fact clearly not true in the
first-order theories in view of Lemma 4.5. Further, there seems to be no
hope of giving a simple syntactical method like the counting of the
number of variables for showing the relation of the truth of a sentence in
one dimension to the truth in another dimension in these extended
theories as was done in the fundamental result for our investigation
Theorem 3.4. However, Tarski has noticed that Corollary 3.5 about
infinite dimensional Euclidean spaces still holds for properties of relations
formulated in the extended theory with finite sets, because the result in
Theorem 3.1 of [2] remains valid in this generalization. Hence, even from
this broader view, there is no way to distinguish between infinite dimen-
sional Euclidean spaces.
Bibliography
[1] TARSKI, A., A decision method for elementary algebra and geometry. Second
edition, Berkeley and Los Angeles 1951, VI + 63 pp.
[2] TARSKI, A. and VAUGHT, R. L., Arithmetical extensions of relational systems.
Compositio Mathematica, vol. 13 (1957), pp. 81-102.
[3] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.
Symposium on the Axiomatic Method
BINARY RELATIONS AS PRIMITIVE NOTIONS
IN ELEMENTARY GEOMETRY
RAPHAEL M. ROBINSON
University of California, Berkeley, California, U.S.A.
1. Introduction. We shall consider equidistance and the order of points
on a line as the standard primitive notions of Euclidean, hyperbolic, or
elliptic geometry. Here equidistance is a quaternary relation, whereas
the order of points on a line is described in Euclidean or hyperbolic
geometry by the ternary relation of betweenness, and in elliptic geometry
by the quaternary relation of cyclic order. Various axiom systems have
been given in terms of these primitive notions; see, for example, Tarski
[7] for the Euclidean case. The adequacy of other proposed primitive
notions for geometry will be judged by comparison with the standard ones.
M. Pieri [4] has shown that a ternary relation, that of a point being equally
distant from two other points, can be used as the only primitive notion
of Euclidean geometry of two or more dimensions. Indeed, in terms of this
relation, it is possible to define equidistance of points in general, and the
order of points on a line. The same ternary relation is also a possible
primitive notion for either of the non-Euclidean geometries, hyperbolic or
elliptic. A detailed discussion of Pieri's relation is given in Section 2.
We may raise the question whether one or more binary relations might
serve as the primitive notions in some of the geometries. This is im-
possible in Euclidean geometry as described above, since the primitive
notions of equidistance and order are preserved by similarity transfor-
mations, and no non-trivial binary relation is so preserved. However, let
us choose a unit distance in the Euclidean space, and regard the property
of two points being a unit distance apart as a new primitive notion. Then
only isometric transformations preserve the primitive notions. The prob-
lem concerning the possibility of using just binary relations as primitive
notions is thus reinstated.
We shall suppose at all times that a $p$-dimensional Euclidean,
hyperbolic, or elliptic space is under discussion.$^1$ The exact value of $p$ is
usually immaterial, except that we shall always suppose that $p \ge 2$;
1 Many of the results stated for elliptic geometry apply also to spherical geome-
try, but we shall not go into this.
this condition will be understood henceforth without explicit mention. (The
one-dimensional case will be excluded because the results there are usually
exceptional and rather trivial.) Only the standard case where the base
field is the field of real numbers will be considered. Thus, for example,
the Euclidean $p$-space will be regarded as the direct $p$th power of the field
of real numbers. The points of the space will be denoted by $A, B, C, \ldots,
X, Y, Z$, or by these letters with subscripts or superscripts. (In contrast
to this, the letters $a, b, c, \ldots, x, y, z$ will be used for real numbers, with
$i, j, k, \ldots, p, q, r$ reserved for natural numbers.)
The space will be regarded as a metric space, the distance from A to B
being denoted by AB. Thus the symbol AB always denotes a non-
negative real number (unless it occurs as part of a formula, such as
$AB \perp CD$, which is defined as a whole). In the Euclidean case, the
distance $AB$ is the square root of the sum of the squares of the differences of
the coordinates of A and B. In the non-Euclidean cases, the metric will be
assumed to be chosen so that the natural unit of length is being used.
The definability of a notion in terms of given notions will always be
understood in this paper as elementary (= arithmetical) definability.$^2$
That is, aside from the given notions, a definition will use only the
concepts of elementary logic, and the only variables used will be $A, B, C, \ldots$,
which range over points of the given space. The following logical symbols
will be used: $\wedge$ (and), $\vee$ (or), $\neg$ (not), $\to$ (if ... then ...), $\leftrightarrow$ (if and only
if), $\bigwedge$ (for every), and $\bigvee$ (there exists). Identity (between points of the
given space) will also be regarded as a logical concept. In addition, we
shall sometimes use equations such as AB = CD. Here the entire equation
may be regarded as a convenient notation for the quaternary relation of
equidistance between points.
We return now to the question whether some of the geometries might
be based on primitive notions which are all binary relations. In other
words, are there some binary relations which are definable in terms of the
usual primitive notions, and in terms of which the usual primitive notions
are definable? We shall show that in elliptic geometry, it is possible to
use a single binary relation as the only primitive notion.$^3$ In particular,
2 Some problems concerning more general types of definability are studied by
Royden [5].
3 This result was found independently by H. L. Royden and the author, shortly
after listening to a lecture by Alfred Tarski on the primitive notions of Euclidean
geometry, based in part on Beth and Tarski [1], in the spring of 1956. The binary
relation used by Royden was AB ^ yr/4.
as shown in Section 3, the binary relation $AB = \pi/2$, which expresses
that the two points $A$ and $B$ are at a distance $\pi/2$ apart (which is the
maximum possible distance in the elliptic space), is a suitable primitive
notion for elliptic geometry.
On the other hand, in Euclidean geometry (with a unit distance given),
or in hyperbolic geometry, it is impossible to use binary relations as the
only primitive notions. This is proved in Section 4 for binary relations of
the form AB = d, and in Section 5 for binary relations in general. The
difference between these geometries and elliptic geometry is due mainly
to the fact that the elliptic space is bounded. In fact, as shown in Section 6,
the local properties of Euclidean and hyperbolic spaces are expressible
in terms of a binary relation, that of two points being at a prescribed
distance apart.
If d is chosen so that the distance d is definable in terms of the usual
primitive notions, then the system based on the binary relation AB = d
as its only primitive notion is weaker than the standard one so far as
definability of concepts is concerned, but otherwise it is incomparable.
The distance d cannot always be definable, since there are a non-de-
numerable infinity of possible values of d, but only a denumerable
infinity of possible definitions. The problem of determining which
distances d are definable is solved in Section 7.
2. Pieri's ternary relation. As mentioned in Section 1, Pieri has shown
that in Euclidean geometry, it is possible to define the equidistance
relation $AB = CD$ and betweenness in terms of the ternary relation
$AB = BC$. His argument is also valid in hyperbolic geometry. We give
below a proof somewhat different from Pieri's, and then show how to
extend it to the elliptic case.
THEOREM 2.1. Pieri's ternary relation $AB = BC$ is a suitable primitive
notion for Euclidean, hyperbolic, or elliptic geometry.
PROOF. Let the symbols bet $(A, B, C)$, col $(A, B, C)$, and sym $(A, B, C)$
express respectively that $B$ is between $A$ and $C$, that $A$, $B$, $C$ are collinear,
and that $A$ and $C$ are symmetric with respect to $B$ (that is, that $B$ is the
midpoint of the segment joining $A$ and $C$). Then the following definitions
are valid formulas in Euclidean or hyperbolic geometry:
$AB \le BC \leftrightarrow (\bigwedge X)[BX = XC \to (\bigvee Y)(AY = YB = BX)]$,
bet $(A, B, C) \leftrightarrow B \neq A \wedge B \neq C \wedge (\bigwedge X)[XA \le AB \wedge XC \le CB \to X = B]$,
col $(A, B, C) \leftrightarrow A = B \vee A = C \vee B = C
\vee$ bet $(B, A, C) \vee$ bet $(A, B, C) \vee$ bet $(A, C, B)$,
sym $(A, B, C) \leftrightarrow (\bigwedge X)[$col $(A, B, X) \wedge AB = BX \leftrightarrow X = A \vee X = C]$,
$AB = CD \leftrightarrow (\bigvee X, Y)[$sym $(A, X, C) \wedge$ sym $(B, X, Y) \wedge YC = CD]$.
Hence betweenness and equidistance are definable in terms of Pieri's
relation, as was to be shown.
We shall now extend the result to the elliptic case.$^4$ Some
modifications of the above definitions are required. The definition of $AB \le BC$
is still correct. The validity of the next definition depends on how we
interpret bet $(A, B, C)$ in elliptic geometry. It is correct if we understand
this to mean that there is a unique shortest line segment joining $A$ and $C$,
and that $B$ is an interior point of this segment. Notice that when $AC = \pi/2$,
as well as when $A = C$, there is no such point $B$.
The definition of col (A, B, C) given above is not valid in elliptic
geometry. We shall give a definition below which expresses collinearity
as a special case of cyclic order, which we also need. Once collinearity has
been defined, the previous definitions of sym (A, B, C) and AB = CD
may be used. Notice that the definition of sym (A, B, C) is so formulated
that the relation holds, as it should, when $A = C$ and $AB = \pi/2$.
We now wish to define the cyclic order of points on a line. We start by
defining recursively a relation seq $(A_0, A_1, \ldots, A_n)$ for $n \ge 2$, as follows:
seq $(A_0, A_1, A_2) \leftrightarrow$ bet $(A_0, A_1, A_2)$,
seq $(A_0, A_1, \ldots, A_n) \leftrightarrow$ seq $(A_0, A_1, \ldots, A_{n-1})
\wedge$ bet $(A_{n-2}, A_{n-1}, A_n) \wedge A_n \neq A_0 \wedge \neg$ bet $(A_{n-1}, A_0, A_n)$.
It is seen that seq $(A_0, A_1, \ldots, A_n)$ expresses that the sequence of points
$A_0, A_1, \ldots, A_n$ lie in this cyclic order on a line, and divide the line into
intervals such that, excluding the one from $A_n$ to $A_0$, the sum of the
lengths of any two consecutive intervals is less than $\pi/2$. This extraneous
condition concerning the lengths of the intervals may be removed by
4 A reader who is concerned only with Euclidean and hyperbolic geometry may
proceed directly to Section 4.
putting
ord $(A_0, A_1, \ldots, A_n) \leftrightarrow (\bigvee X_0, X_1, \ldots, X_{4n})[X_0 = A_0 \wedge X_4 = A_1
\wedge \ldots \wedge X_{4n} = A_n \wedge$ seq $(X_0, X_1, \ldots, X_{4n})]$.
Then, as is easily seen, ord $(A_0, A_1, \ldots, A_n)$ expresses simply that the
points $A_0, A_1, \ldots, A_n$ are in this cyclic order on a line. In particular,
ord $(A, B, C, D)$ is the basic quaternary relation of cyclic order in elliptic
geometry. Furthermore,
col $(A, B, C) \leftrightarrow A = B \vee A = C \vee B = C \vee$ ord $(A, B, C)$,
so that, as previously noted, equidistance is also definable. Thus Pieri's
ternary relation is a suitable primitive notion for elliptic geometry. (It
may be noticed that the relation seq $(A_0, A_1, \ldots, A_n)$, defined above for
all $n \ge 2$, was needed only for $n \le 12$.)
3. Binary primitives for elliptic geometry. In elliptic geometry, Pieri's
relation is definable in terms of any distance $d$ with $0 < d \le \pi/2$. The
converse holds only if $\cos d$ is algebraic. In this case, the binary relation
$AB = d$ is a suitable primitive notion for elliptic geometry. A detailed
proof is given only for $d = \pi/2$, which seems to be the most interesting
case, since the condition $AB = \pi/2$ expresses that the polar of either of
the points $A$ or $B$ passes through the other.
We start by noticing two definitions that can be used in the elliptic
plane. The formulas
col $(A, B, C) \leftrightarrow (\bigvee X)[AX = BX = CX = \pi/2]$
and
$AB \perp CD \leftrightarrow A \neq B \wedge C \neq D \wedge (\bigvee X)[AX = BX = \pi/2 \wedge$ col $(C, D, X)]$
define the collinearity of three points and the perpendicularity of two
lines, in terms of the distance $\pi/2$. Here, of course, a notation such as
$AX = BX = \pi/2$ is short for $AX = \pi/2 \wedge BX = \pi/2$; the concept of
equidistance is not involved.
THEOREM 3.1. The following formula holds in the elliptic plane:
$BC = CA = AB = \pi/2 \wedge$ col $(B, P, C) \wedge$ col $(C, Q, A) \wedge$ col $(A, R, B)
\wedge P \neq C \wedge Q \neq C \wedge AP \perp QR \wedge BQ \perp PR \to AR = \pi/4$.
PROOF. One model of the elliptic plane consists of all lines through
the origin in a three-dimensional Euclidean space. We may identify $A$,
$B$, $C$ with the $x$, $y$, $z$ axes. Then $AP$ and $BQ$ correspond to planes $z = ay$
and $z = bx$, with suitable values of $a$ and $b$. The planes through $P$
perpendicular to $BQ$ and through $Q$ perpendicular to $AP$ are
$x - aby + bz = 0$, $-abx + y + az = 0$.
These planes intersect the plane $z = 0$ in lines where $x = aby$ and $y = abx$,
respectively. These lines coincide only if $ab = \pm 1$. Thus the point $R$ is
represented by one of the lines $y = \pm x$, $z = 0$.
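The two plane equations in this proof can be verified mechanically. The sketch below is an illustration, not part of the original proof; the function names are hypothetical. It represents each plane through the origin by a normal vector, obtains the plane through a line perpendicular to a given plane as the cross product of the line's direction with that plane's normal, and checks when the two traces in $z = 0$ coincide.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def perpendicular_planes(a, b):
    """Normals of the planes through P perpendicular to BQ and through Q perpendicular to AP."""
    p = (0, 1, a)            # direction of the line P: x = 0, z = a*y
    q = (1, 0, b)            # direction of the line Q: y = 0, z = b*x
    normal_ap = (0, a, -1)   # plane AP: a*y - z = 0
    normal_bq = (b, 0, -1)   # plane BQ: b*x - z = 0
    through_p = cross(p, normal_bq)   # contains P and the normal of BQ
    through_q = cross(q, normal_ap)   # contains Q and the normal of AP
    return through_p, through_q


def traces_coincide(a, b):
    """True when both planes cut z = 0 in the same line, so that a point R exists."""
    n1, n2 = perpendicular_planes(a, b)
    return n1[0] * n2[1] - n1[1] * n2[0] == 0
```

With $a = 2$, $b = 1/2$ the normals come out as $(-1, 1, -1/2)$ and $(-1, 1, 2)$, that is, $x - aby + bz = 0$ and $-abx + y + az = 0$ up to sign, and the traces coincide, in agreement with the condition $ab = \pm 1$.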
THEOREM 3.2. The binary relation $AB = \pi/2$ is a suitable primitive
notion for elliptic geometry.
PROOF. We can define the distance $\pi/2$ in terms of Pieri's relation, by
using the definition of bet $(A, B, C)$ from Section 2, and the formula
$AB = \pi/2 \leftrightarrow A \neq B \wedge \neg(\bigvee X)$ bet $(A, X, B)$.
It remains to define Pieri's relation in terms of the distance $\pi/2$.
Consider first elliptic plane geometry. Notice that the distance $\pi/4$ is
definable. Indeed, we see that $AR = \pi/4$ if and only if there exist points
$B$, $C$, $P$, $Q$ satisfying the conditions stated in Theorem 3.1.
We now give a series of further definitions leading to Pieri's relation
$AB = BC$:
mid $(A, B, C) \leftrightarrow$ col $(A, B, C) \wedge (\bigvee X)[AX = CX = \pi/4 \wedge AC \perp BX]$,
mex $(A, B, C) \leftrightarrow$ col $(A, B, C) \wedge (\bigvee X)[$mid $(A, X, C) \wedge BX = \pi/2]$,
sym $(A, B, C) \leftrightarrow$ mid $(A, B, C) \vee$ mex $(A, B, C) \vee A = B = C
\vee (A = C \wedge AB = \pi/2) \vee (AC = \pi/2 \wedge AB = BC = \pi/4)$,
$AB = BC \leftrightarrow (\bigvee X)[$sym $(A, X, C) \wedge BX = \pi/2]$.
Here the conditions mid $(A, B, C)$ and mex $(A, B, C)$ require that $A \neq C$
and $AC \neq \pi/2$, and that $B$ is the midpoint of the shorter or longer line
segment joining $A$ and $C$ (the "internal" or "external" midpoint). The
definition of sym $(A, B, C)$ then gives a complete listing of the cases in
which $A$ and $C$ are symmetric with respect to $B$. Notice that, in the
definition of $AB = BC$, if $A \neq C$, then there are just two possible values
of $X$, the midpoints of the two segments joining $A$ and $C$, and the polar of
either is perpendicular to $AC$ at the other. If $A = C$, then $X$ may be $A$ or
any point on the polar of $A$, and $B$ is completely arbitrary, as it should be.
The restriction to plane geometry may be removed by noticing that it
is possible to define the concept of a plane in $p$-space in terms of the
distance $\pi/2$. We can then define the relation $AB = BC$ by applying the
previous method in a plane containing $A$, $B$, and $C$.
THEOREM 3.3. In elliptic geometry, equidistance is definable in terms
of the distance $d$, for any $d$ with $0 < d \le \pi/2$.
This can be derived from Theorem 3.2, by defining the distance $\pi/2$ in
terms of the distance $d$, but the details of the proof will be omitted.
Combining this result with Theorem 7.3, we see that the binary relation
$AB = d$ is a suitable primitive notion for elliptic geometry if and only if
$0 < d \le \pi/2$ and $\cos d$ is algebraic.
4. Patch-wise congruence. Let any Euclidean or hyperbolic space be
given. Then we put
$$\mathrm{con}(X_1, X_2, \ldots, X_m; X_1', X_2', \ldots, X_m') \leftrightarrow X_1X_2 = X_1'X_2' \wedge X_1X_3 = X_1'X_3' \wedge \cdots \wedge X_{m-1}X_m = X_{m-1}'X_m'.$$
That is, two finite sequences of points are called congruent if all the
corresponding distances are equal. The space has a certain property of
homogeneity expressed by the condition
$$\mathrm{con}(X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow (\forall Y)(\exists Y')\,\mathrm{con}(X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y'),$$
which holds for all values of m. The only other fact about the given space
that we use in this section is that the space is unbounded.
The concept of congruence will now be extended to that of patch-wise
congruence. If c > 0, the formula
$$\mathrm{pat}(c: X_1, X_2, \ldots, X_m; X_1', X_2', \ldots, X_m')$$
will be used to denote that the two sequences $X_1, X_2, \ldots, X_m$ and
$X_1', X_2', \ldots, X_m'$ are patch-wise congruent, with separation constant c. This
formula is defined as follows. We start by considering any partition of the
indices $1, 2, \ldots, m$. For this partition, we form the conjunction of all the
formulas $X_iX_j = X_i'X_j'$ for i and j in the same class, and of all the
formulas $X_iX_j > c$ and $X_i'X_j' > c$ for i and j in different classes. The
disjunction of all these conjunctions, formed for all possible partitions,
is the required formula expressing patch-wise congruence.
The formula $\mathrm{pat}(c: X_1, \ldots, X_m; X_1', \ldots, X_m')$ constructed in this
way actually expresses that the two sequences of points can be divided
into patches which are respectively congruent, such that the distance
between any two patches is greater than c. We shall now show that the
formula expressing the property of homogeneity mentioned above may be
extended to patch- wise congruence.
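The disjunction over partitions can be evaluated directly for concrete point sequences. The following is only an illustrative sketch for the Euclidean case, not part of the original argument: the names `set_partitions` and `pat`, and the floating-point tolerance `tol`, are assumptions introduced here.

```python
import math
from itertools import combinations

def set_partitions(items):
    """Yield every partition of a list into non-empty classes."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Put `first` into each existing class in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new class of its own.
        yield [[first]] + part

def pat(c, xs, ys, tol=1e-9):
    """True if xs and ys are patch-wise congruent with separation constant c."""
    m = len(xs)
    assert len(ys) == m
    for part in set_partitions(list(range(m))):
        cls = {i: k for k, block in enumerate(part) for i in block}
        ok = True
        for i, j in combinations(range(m), 2):
            dx = math.dist(xs[i], xs[j])
            dy = math.dist(ys[i], ys[j])
            if cls[i] == cls[j]:
                ok = abs(dx - dy) <= tol      # same class: equal distances
            else:
                ok = dx > c and dy > c        # different classes: both exceed c
            if not ok:
                break
        if ok:                                # one valid disjunct suffices
            return True
    return False
```

For example, two unit segments placed far apart in each sequence are patch-wise congruent for a small separation constant even when the two sequences are not congruent as wholes.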
THEOREM 4.1. In any Euclidean or hyperbolic space, and for any
c > 0, we have
$$\mathrm{pat}(2c: X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow (\forall Y)(\exists Y')\,\mathrm{pat}(c: X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y').$$
PROOF. Pick out one disjunct of the hypothesis which is valid. This
determines which points are to be considered as belonging to the same
patch. Now if Y is at a distance greater than c from all the points $X_i$,
then it may be considered as forming a new patch, and we may choose for
Y' any point at a distance greater than c from all the points $X_i'$. Otherwise,
there is a unique patch, among the points $X_1, \ldots, X_m$, such that
Y is at a distance at most c from some point of the patch. Choose Y' in a
corresponding position relative to the corresponding patch of the points
$X_i'$.
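THEOREM 4.2. Let d be a positive number. Let a Euclidean or hyperbolic
space be given. Let $\alpha_1, \alpha_2, \ldots, \alpha_q$ be binary relations on this space, such that
for $k = 1, 2, \ldots, q$, we have
$$XY = X'Y' \rightarrow [\alpha_k(X, Y) \leftrightarrow \alpha_k(X', Y')]$$
and
$$\alpha_k(X, Y) \rightarrow XY \leq d.$$
Let $\phi$ be a formula with free variables $X_1, \ldots, X_m$, which is elementary in
terms of $\alpha_1, \ldots, \alpha_q$; that is, the atomic formulas of $\phi$ have the form $X = Y$
or $\alpha_k(X, Y)$. Then
$$\mathrm{pat}(2^n d: X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow [\phi(X_1, \ldots, X_m) \leftrightarrow \phi(X_1', \ldots, X_m')],$$
provided that $\phi$ does not contain more than n nested quantifiers. 5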
5 It can be shown that $2^n d$ is the smallest possible separation constant which can
be used here.
PROOF. By induction on n. The result is clear for $n = 0$, since on the
basis of the hypothesis about patch-wise congruence, we have, for any
possible i and j, either $X_iX_j = X_i'X_j'$, or else both $X_iX_j > d$ and
$X_i'X_j' > d$. Hence
$$X_i = X_j \leftrightarrow X_i' = X_j', \qquad \alpha_k(X_i, X_j) \leftrightarrow \alpha_k(X_i', X_j')$$
for all values of k, and the conclusion follows.
We now assume the theorem for some value of n, and prove it for $n + 1$.
It will be sufficient to consider the case in which
$$\phi(X_1, \ldots, X_m) \leftrightarrow (\exists Y)\,\psi(X_1, \ldots, X_m, Y),$$
where $\psi$ is an elementary formula with the indicated free variables and
containing at most n nested quantifiers. (For the truth-value of any
admissible formula $\phi$ can be determined from the truth-values of formulas
of this form.) Now by the inductive hypothesis, we have
$$\mathrm{pat}(2^n d: X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y') \rightarrow [\psi(X_1, \ldots, X_m, Y) \leftrightarrow \psi(X_1', \ldots, X_m', Y')],$$
and according to Theorem 4.1,
$$\mathrm{pat}(2^{n+1} d: X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow (\forall Y)(\exists Y')\,\mathrm{pat}(2^n d: X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y').$$
Combining these results, we see that
$$\mathrm{pat}(2^{n+1} d: X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow [(\exists Y)\,\psi(X_1, \ldots, X_m, Y) \leftrightarrow (\exists Y')\,\psi(X_1', \ldots, X_m', Y')].$$
THEOREM 4.3. Under the same hypotheses on $\alpha_1, \alpha_2, \ldots, \alpha_q$, equidistance
is not definable in terms of them.
PROOF. Suppose the relation $X_1X_2 = X_3X_4$ were definable in terms
of $\alpha_1, \ldots, \alpha_q$, using a formula containing at most n nested quantifiers.
Then, by Theorem 4.2, we would have
$$\mathrm{pat}(2^n d: X_1, X_2, X_3, X_4; X_1', X_2', X_3', X_4') \rightarrow [X_1X_2 = X_3X_4 \leftrightarrow X_1'X_2' = X_3'X_4'].$$
This is certainly false, since the hypothesis holds whenever we have
$X_iX_j > 2^n d$ and $X_i'X_j' > 2^n d$ for all distinct i and j.
THEOREM 4.4. In Euclidean or hyperbolic geometry, equidistance is not
definable in terms of any number of particular distances.
PROOF. For $k = 1, 2, \ldots, q$, let $\alpha_k(X, Y) \leftrightarrow XY = d_k$, where $d_k > 0$.
Then $\alpha_1, \alpha_2, \ldots, \alpha_q$ satisfy the hypotheses of Theorem 4.2, if we take
$d = \max(d_1, d_2, \ldots, d_q)$. Now apply Theorem 4.3.
5. No binary primitives for Euclidean or hyperbolic geometry. We now
come to the question whether there are any binary relations $\alpha_1(X, Y), \ldots, \alpha_q(X, Y)$
which are suitable primitive notions for Euclidean geometry
(with a unit distance) or for hyperbolic geometry. To be suitable, they
should be definable in terms of equidistance and the unit distance in the
Euclidean case, and in terms of equidistance alone in the hyperbolic
case, and conversely. To show that this is impossible, we start by studying
binary relations which are definable in terms of equidistance and r
particular distances $d_1, d_2, \ldots, d_r$. (Actually, we need only $r = 1$, $d_1 = 1$
in the Euclidean case, and $r = 0$ in the hyperbolic case.) In the hyperbolic
case, a preliminary theorem is needed.
THEOREM 5.1. In hyperbolic p-space, it is possible to introduce coordinates
$(x_1, x_2, \ldots, x_p)$, with $x_1^2 + x_2^2 + \cdots + x_p^2 < 1$, so that $e^{AB}$
can be calculated from the coordinates of A and B using rational operations
and the extraction of square roots.
PROOF. In the interior of the unit sphere in Euclidean p-space
introduce a new metric $[A, B]$ by putting $[A, B] = 0$ if $A = B$, and
$$[A, B] = \log \frac{AR \cdot BS}{BR \cdot AS}$$
otherwise, where R and S are the two points where the line joining A and
B intersects the unit sphere. From the coordinates of A and B, we can
calculate successively the coordinates of R and S, the distances AR, BS,
BR, AS, and finally $e^{[A, B]}$, using only rational operations and the extraction
of square roots. Now it is known that, with the metric just introduced,
the interior of the unit sphere in Euclidean p-space becomes a model for
hyperbolic p-space. (See, for example, Hilbert and Cohn-Vossen [2], § 35.)
Thus the theorem restates, from a different viewpoint, what we have just
proved.
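The computation described in the proof can be sketched numerically. This is an illustration only: the function name `exp_klein_distance` is an assumption, and so is the labelling of the endpoints (R is taken on the side of B and S on the side of A, so that the ratio is at least 1). The source does not fix this labelling or any normalizing factor.

```python
import math

def exp_klein_distance(a, b):
    """Cross ratio (AR*BS)/(BR*AS) for points a, b inside the unit sphere.

    Only rational operations and one square root are used.
    """
    if a == b:
        return 1.0
    u = [bi - ai for ai, bi in zip(a, b)]          # direction from A to B
    uu = sum(x * x for x in u)
    au = sum(x * y for x, y in zip(a, u))
    aa = sum(x * x for x in a)
    # Points of the line are A + t*u; |A + t*u|^2 = 1 is the quadratic
    # uu*t^2 + 2*au*t + (aa - 1) = 0, with one root < 0 and one root > 1.
    root = math.sqrt(au * au - uu * (aa - 1.0))
    t_r = (-au + root) / uu                        # parameter of R (beyond B)
    t_s = (-au - root) / uu                        # parameter of S (beyond A)
    # Distances along the line are |t| * |u|; the factor |u| cancels.
    ar, br = t_r, t_r - 1.0
    a_s, bs = -t_s, 1.0 - t_s
    return (ar * bs) / (br * a_s)
```

On a diameter the ratio is multiplicative over adjacent segments, which reflects the additivity of its logarithm along a line.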
THEOREM 5.2. In a Euclidean or hyperbolic space, any binary relation
$\alpha(X, Y)$ which is definable in terms of equidistance and particular distances
$d_1, d_2, \ldots, d_r$, satisfies the condition
$$XY = X'Y' \rightarrow [\alpha(X, Y) \leftrightarrow \alpha(X', Y')],$$
and, for some $d > 0$, one of the conditions
$$XY > d \rightarrow \alpha(X, Y), \qquad XY > d \rightarrow \neg\alpha(X, Y).$$
PROOF. The first conclusion is clear, since there is an isometric
mapping of the space onto itself which takes X into X' and Y into Y'.
This mapping preserves the equidistance relation and the particular
distances, and hence anything definable in terms of them.
We turn now to the second conclusion. Suppose that $t > 0$, and consider
the formula
$$(\forall X, Y)[XY = t \rightarrow \alpha(X, Y)].$$
We can eliminate all point variables in favor of real variables, by introducing
coordinates. In the Euclidean case, by simply squaring all equations
that occur, we obtain an equivalent formula of elementary algebra,
containing only $t$ and $d_1, d_2, \ldots, d_r$ as free variables. In the hyperbolic
case, we use Theorem 5.1; by a little manipulation, including the elimination
of square roots by introducing additional existential quantifiers,
we again obtain an equivalent formula of elementary algebra, where in
this case $e^t$ and $e^{d_1}, \ldots, e^{d_r}$ play the role of free variables.
Following the procedure of Tarski [6], all bound variables can be
eliminated, if we allow the introduction of inequalities (Tarski's Theorem
31). If numerical values are assigned to $d_1, d_2, \ldots, d_r$, we see that there is
a real number d such that the resulting formula is either true for all $t > d$
or else false for all $t > d$. Thus the same alternatives hold for the displayed
formula with which we started. In the first case, we have $XY > d \rightarrow \alpha(X, Y)$.
In the second, taking account of the fact that the truth-value of
$\alpha(X, Y)$ depends only on XY, we see that $XY > d \rightarrow \neg\alpha(X, Y)$.
THEOREM 5.3. In a Euclidean or hyperbolic space, it is impossible to
find binary relations $\alpha_1, \alpha_2, \ldots, \alpha_q$, which are definable in terms of equidistance
and particular distances, and in terms of which equidistance is
definable. Thus there are no binary relations which are suitable as the primitive
notions of Euclidean geometry (with a unit distance) or of hyperbolic
geometry.
PROOF. We may apply Theorem 5.2 to each of the relations $\alpha_k$. By
replacing $\alpha_k$ by $\neg\alpha_k$ if necessary, we may assume that we have
$XY > d \rightarrow \neg\alpha_k(X, Y)$, and hence $\alpha_k(X, Y) \rightarrow XY \leq d$, for all values of k. The proof
is completed by applying Theorem 4.3.
6. Local definability of equidistance. Although, as shown in Section 4,
equidistance is not definable in terms of particular distances in Euclidean
or hyperbolic geometry, nevertheless equidistance is locally definable in
terms of a single given distance. 6 By a local definition of equidistance
AB = CD (in terms of a given distance d) is meant a formula which
provides a necessary and sufficient condition for this equality, on the
assumption that the distance between each two of the four points does
not exceed a prescribed bound. We shall see that this bound can be taken
arbitrarily large, although the formula required becomes longer as the
bound increases. (Throughout this section, d denotes an arbitrary
positive number.)
THEOREM 6.1. In Euclidean or hyperbolic geometry:
(a) The distance 2d is definable in terms of the distance d.
(b) Any one of the relations $AB = d$, $AB \leq d$, $AB < d$ is definable
in terms of any other one.
(c) The local symmetry relation $\mathrm{sym}(A, B, C) \wedge AB \leq h$ is definable
in terms of the distance d if and only if the distance h is definable in terms of
the distance d.
PROOF. (a) We may use the formula 7
$$AB = 2d \leftrightarrow (\exists X)(\forall Y)[AY = d \wedge BY = d \leftrightarrow Y = X],$$
6 At the time this paper was presented to the Symposium, I knew this result only
for Euclidean or hyperbolic geometry of three or more dimensions. A few days
afterwards, A. Seidenberg pointed out to me that the linkage of Peaucellier, which
enables one to draw a line segment in the Euclidean plane, furnishes a local defi-
nition of collinearity in terms of a particular distance, and that this in turn leads to
a local definition of equidistance. Some time later, Seidenberg also succeeded in
extending the result to the hyperbolic plane. Subsequently, the author found a
different and simpler solution to this problem. The method used here can also be
adapted to the higher-dimensional hyperbolic spaces, and is presented below in this
extended form. The local definition of midpoint used in the proof of Theorem 6.1(c)
is a modified form of the definition suggested by Seidenberg for use in the hyper-
bolic plane.
7 This definition of the distance 2d in terms of the distance d uses an existential
and a universal quantifier. In Euclidean geometry, it is also possible to define the
distance 2d existentially in terms of the distance d, that is, by means of a formula in
prenex form containing only existential quantifiers. (In the two-dimensional case,
(b) Whichever of the three relations is given, we can easily define
$AB < 2d$, since this expresses that the spheres of radius d about A and B
overlap. If the given relation is $AB \leq d$, then we may use the formula
$$AB < d \leftrightarrow (\forall X)[AX \leq d \rightarrow BX < 2d]$$
to define $AB < d$, and hence $AB = d$ can also be defined. A similar
argument applies in the other cases.
(c) We see that
$$0 < AB < 2d \rightarrow [\mathrm{sym}(A, B, C) \leftrightarrow B \neq C \wedge (\forall X, Y)(AX = XB = BY = d \wedge XY = 2d \rightarrow CY = d)].$$
Indeed, the possible values of X lie on the intersection of two spheres
$AX = d$ and $BX = d$, and, since $0 < AB < 2d$, these spheres actually
intersect. The possible values of Y are those symmetric to X with respect
to B. The only point C, other than B itself, which is at a distance d from
all such points Y is the point symmetric to A with respect to B. This
formula clearly leads to a suitable definition of the relation $\mathrm{sym}(A, B, C) \wedge AB \leq d$,
and the stated result then follows easily.
THEOREM 6.2. In Euclidean or hyperbolic geometry, the local Pieri
relation $AB = BC \leq h$ can be defined in terms of the distance d if and only
if the distance h can be defined in terms of the distance d.
PROOF. The necessity of the condition follows from Theorem 6.1(b).
To prove the sufficiency, we need only show that the relation $AB = BC \leq d$
is definable in terms of the distance d. The proof is divided into three
cases.
CASE 1. Euclidean p-space, $p \geq 3$. It is easily seen that
$$AB = BC \leq 2d \leftrightarrow (\exists X, Y, Z)[AY = YC = 2d \wedge AX = XY = YZ = ZC = XB = BZ = d].$$
This is based on the idea of taking an isosceles triangle whose equal sides
are 2d and folding it along the line joining the midpoints of the two equal
use a network of equilateral triangles. In three dimensions, use twice the fact that
the diagonal of an octahedron is $2^{1/2}$ times an edge, and similarly in higher
dimensions.) Starting from this fact, it is possible to put the local definition of
equidistance in an existential form, and to define all algebraic distances existentially
in terms of the unit distance (thus sharpening Theorems 6.3 and 7.1). I do not see
any way of doing the corresponding things in the non-Euclidean cases.
sides. The vertex remains at an equal distance from the two ends of the
base, and this distance may be any amount not less than half the base and
not exceeding 2d. By adjoining the condition $AB \leq d$, we obtain the
required relation $AB = BC \leq d$.
CASE 2. Euclidean plane. There is a well-known linkage, due to
Peaucellier, which can be used to draw a line segment. (See Kempe [3] or
Hilbert and Cohn-Vossen [2], § 40.) Choosing, for the lengths of all links,
distances definable in terms of d (for example, suitable multiples of d),
and considering three positions of the linkage, we obtain a local definition
of collinearity in terms of the distance d. Combining that with the formula
$$AB = BC \wedge AC < 2d \leftrightarrow (\exists X, Y)[X \neq Y \wedge AX = CX = AY = CY = d \wedge \mathrm{col}(B, X, Y)],$$
we easily obtain the required result.
CASE 3. Hyperbolic p-space, $p \geq 2$. If $p \geq 3$, we could proceed much
as in Case 1. However, we shall apply a different method, which does not
exclude the case $p = 2$, but which definitely uses the non-Euclidean
character of the space. In fact, we see that
$$AC = 2d \rightarrow \{\mathrm{col}(A, B, C) \leftrightarrow (\exists X, Y)[\mathrm{sym}(A, X, B) \wedge \mathrm{sym}(B, Y, C) \wedge XY = d]\}.$$
We have expressed the similarity of the triangles ABC and XBY, which
is impossible unless the triangles are degenerate. Indeed, in hyperbolic
geometry, the line joining the midpoints of two sides of a triangle is less
than half as long as the third side. Since $\mathrm{sym}(A, B, C)$ is locally definable,
we can obtain a local definition of $\mathrm{col}(A, B, C)$, at least under the
restriction that $AC = 2d$. From this, we can get a local definition of
$\mathrm{col}(B_1, B_2, B_3)$, without such a restriction, by considering three values
of B with the same A and C. We can then proceed to a local definition of
Pieri's relation as in Case 2.
THEOREM 6.3. In Euclidean or hyperbolic geometry, the local equi-
distance relation
can be defined in terms of the distance d if and only if the distance h is
definable in terms of the distance d.
PROOF. The condition is clearly necessary, and the sufficiency can be
derived from Theorem 6.2 by a suitable modification of the method used
in Section 2.
7. Definable distances. We shall now determine what distances t are
definable in terms of a given distance d, with or without the use of equi-
distance, or, in the non-Euclidean cases, in terms of equidistance alone.
We start by giving a few definitions valid in both the Euclidean and
hyperbolic geometries. (With some modifications, they can be used also
in the elliptic case.) The relation of equidistance is considered as given,
and notions previously defined in terms of equidistance are also used. In
the first place, we have
$$AB = CD + EF \leftrightarrow (AB = CD \wedge E = F) \vee (AB = EF \wedge C = D) \vee (\exists X)[\mathrm{bet}(A, X, B) \wedge AX = CD \wedge XB = EF].$$
We also wish to define perpendicularity. A special case is covered by the
formula
$$AC \perp BC \leftrightarrow A \neq C \wedge B \neq C \wedge (\exists X)[\mathrm{sym}(B, C, X) \wedge AB = AX].$$
We can then proceed to the formula
$$AB \perp CD \leftrightarrow A \neq B \wedge C \neq D \wedge (\exists X, Y, Z)[XZ \perp YZ \wedge \mathrm{col}(A, X, Z) \wedge \mathrm{col}(B, X, Z) \wedge \mathrm{col}(C, Y, Z) \wedge \mathrm{col}(D, Y, Z)],$$
which defines perpendicularity in general.
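The special case of perpendicularity at a common point can be checked numerically in Euclidean coordinates. This is a sketch only: it assumes that sym (B, C, X) says that C is the midpoint of B and X, so that X is the reflection of B in C, and the name `perp_at_c` and the tolerance `tol` are introduced here for illustration.

```python
import math

def perp_at_c(a, b, c, tol=1e-9):
    """Test AC perpendicular to BC via the equidistance condition AB = AX."""
    if a == c or b == c:
        return False                       # the formula requires A != C and B != C
    # Reflection of B in C; in Euclidean space this X is the unique point
    # with sym(B, C, X) under the assumption stated above.
    x = tuple(2 * ci - bi for bi, ci in zip(b, c))
    return abs(math.dist(a, b) - math.dist(a, x)) <= tol
```

A point A is equidistant from B and its reflection X exactly when A lies on the perpendicular bisector of BX, which passes through C at right angles to BC.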
THEOREM 7.1. In Euclidean geometry, the distance t is definable in terms
of equidistance and the unit distance if and only if t is algebraic. The algebraic
distances are indeed definable in terms of the unit distance alone.
PROOF. Suppose that $(\forall A, B)[AB = t \leftrightarrow \phi(A, B)]$ is a valid formula
of p-dimensional Euclidean geometry, where $\phi(A, B)$ is expressed in
terms of equidistance and the unit distance. By introducing coordinates,
it can be transformed into a formula of elementary algebra, with t as its
only free variable. By Tarski [6], the bound variables may be eliminated,
which leads to the conclusion that t must be algebraic.
It remains to show that all algebraic distances can be defined. We have
already defined $AB = CD + EF$, and we can define the product of two
distances by the formula
$$AB = CD \cdot EF \leftrightarrow (A = B \wedge C = D) \vee (\exists P, Q, R, S)[\mathrm{col}(P, A, B) \wedge \mathrm{col}(P, Q, R) \wedge P \neq A \wedge AQ \perp PS \wedge BR \perp PS \wedge PQ = 1 \wedge PA = CD \wedge QR = EF].$$
Using these definitions of sum and product of two distances, we can
express that a certain distance satisfies a given algebraic equation. By
the use of suitable inequalities, which are also definable, we can isolate
a particular root, and hence define that $AB = t$, where t is a given algebraic
number.
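The product formula rests on similar triangles with a unit segment PQ. A minimal numerical sketch of that configuration in the Euclidean plane follows; the coordinates, the angle `theta`, and the name `product_segment` are assumptions chosen for illustration, and parallelism of AQ and BR is imposed here through direction vectors rather than through a common perpendicular PS.

```python
import math

def product_segment(cd, ef, theta=1.0):
    """Length AB in the configuration PA = CD, PQ = 1, QR = EF, AQ parallel to BR.

    Requires cd > 0 and sin(theta) != 0.
    """
    a = (cd, 0.0)                                   # P is the origin, PA = CD on the x-axis
    q = (math.cos(theta), math.sin(theta))          # PQ = 1 on a second line through P
    r = ((1.0 + ef) * q[0], (1.0 + ef) * q[1])      # QR = EF, with Q between P and R
    # B is where the line through R parallel to AQ meets the line PA (y = 0).
    dx, dy = a[0] - q[0], a[1] - q[1]
    s = -r[1] / dy
    b = (r[0] + s * dx, 0.0)
    return math.dist(a, b)
```

With CD = 2 and EF = 3 the construction returns a segment of length 6, up to rounding.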
If we are given only the unit distance, but not equidistance, then
equidistance is nevertheless locally definable. All of the concepts used can
be defined locally, which is sufficient for the purposes of the proof.
(Notice that in the definition of the product of two distances above, we
expressed the parallelism of the lines AQ and BR by the existence of a
common perpendicular, and not by the non-existence of a point of inter-
section, so that this transition would be possible.)
THEOREM 7.2. In hyperbolic geometry, the distance t is definable in
terms of equidistance if and only if $e^t$ is algebraic.
PROOF. Using Theorem 5.1 and Tarski [6], we see that only such
distances can be definable in terms of equidistance. It remains to show
that all such distances are definable.
We have defined the relation $AB = CD + EF$, but the definition of
$AB = CD \cdot EF$ does not apply here. Indeed, this product formula is not
definable, since if it were, we could define the unit distance $AB = 1$,
which is impossible since e is not algebraic. But we shall show that it is
possible to define the two formulas
$$\cosh AB = \cosh CD + \cosh EF, \qquad \cosh AB = \cosh CD \cdot \cosh EF.$$
We will then be able to express the condition that $\cosh AB$ satisfies a
given algebraic equation, and hence the condition that it is a given
algebraic number. Thus a distance t will be definable if $\cosh t$ is algebraic,
or, what is equivalent, if $e^t$ is algebraic.
The definition of the second formula follows at once from the known
formula $\cosh c = \cosh a \cosh b$ connecting the sides of a right triangle.
Thus we have
$$\cosh AB = \cosh CD \cdot \cosh EF \leftrightarrow (AB = CD \wedge E = F) \vee (AB = EF \wedge C = D) \vee (\exists X)[AX \perp BX \wedge AX = CD \wedge BX = EF].$$
Also, since $2 \cosh x \cosh y = \cosh(x + y) + \cosh(x - y)$, we see that
$$2 \cosh AB = \cosh CD + \cosh EF \wedge CD \geq EF \leftrightarrow (\exists P, Q, R, S)[\cosh AB = \cosh PQ \cdot \cosh RS \wedge CD = PQ + RS \wedge EF + RS = PQ],$$
which leads to a definition of $2 \cosh AB = \cosh CD + \cosh EF$. The
factor 2 on the left could be removed, if we were able to define the
relation $\cosh XY = 2$. This can be done, for example, by a judicious
combination of the above formulas. Indeed, we see that
$$\cosh AB = 2 \leftrightarrow A \neq B \wedge (\exists P, Q, R, S)[\cosh PQ = \cosh^2 AB \wedge 2 \cosh AB = \cosh RS + 1 \wedge 2 \cosh RS = \cosh AB + \cosh PQ].$$
Since cosh XX = 1 , we see that all the equations on the right are special
cases of the formulas which we have defined, so that this furnishes the
desired definition.
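The algebra behind this definition can be checked by elimination. The sketch below assumes that the three conjuncts read cosh PQ = cosh² AB, 2 cosh AB = cosh RS + 1, and 2 cosh RS = cosh AB + cosh PQ (the last one is partly illegible in the source and is a reconstruction); the name `residual` is introduced here for illustration.

```python
def residual(c):
    """Defect of the third equation when cosh AB = c and the first two hold."""
    cosh_pq = c * c              # cosh PQ = cosh^2 AB
    cosh_rs = 2 * c - 1          # from 2 cosh AB = cosh RS + 1
    # 2 cosh RS = cosh AB + cosh PQ, moved to one side:
    return 2 * cosh_rs - (c + cosh_pq)
```

The residual equals -(c - 1)(c - 2), so it vanishes only for c = 1 and c = 2; the condition A ≠ B excludes c = 1 and leaves cosh AB = 2.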
The proofs of the last two theorems will be omitted, since they do not
require any essentially new methods.
THEOREM 7.3. In elliptic geometry, the distance t is definable in terms
of equidistance if and only if $\cos t$ is algebraic.
THEOREM 7.4. The distance t is definable in terms of the distance d
(where $d > 0$, and in the elliptic case also $d \leq \pi/2$) if and only if the stated
condition is satisfied.
(a) Euclidean case: $t/d$ is algebraic.
(b) Hyperbolic case: $e^t$ is algebraic in terms of $e^d$.
(c) Elliptic case: $\cos t$ is algebraic in terms of $\cos d$.
These results are unchanged if the relation of equidistance is also considered
as given.
Bibliography
[1] BETH, E. W. and A. TARSKI, Equilaterality as the only primitive notion of
Euclidean geometry. Indagationes Mathematicae, vol. 18 (1956), pp. 462-467.
[2] HILBERT, D. and S. COHN-VOSSEN, Anschauliche Geometrie. Berlin 1932,
viii + 310 pp. [English translation: Geometry and the imagination. New York
1956, ix + 357 pp.]
[3] KEMPE, A. B., How to draw a straight line; a lecture on linkages. London 1877,
vi + 51 pp.
[4] PIERI, M., La geometria elementare istituita sulle nozioni di 'punto' e 'sfera'.
Memorie di Matematica e di Fisica della Societa Italiana delle Scienze, ser. 3,
vol. 15 (1908), pp. 345-450.
[5] ROYDEN, H. L., Remarks on primitive notions for elementary Euclidean and non-
Euclidean plane geometry. This volume, pp. 86-96.
[6] TARSKI, A., A decision method for elementary algebra and geometry. Second
edition, Berkeley and Los Angeles 1951, iv + 63 pp.
[7] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.
Symposium on the Axiomatic Method
REMARKS ON PRIMITIVE NOTIONS FOR ELEMENTARY
EUCLIDEAN AND NON-EUCLIDEAN PLANE GEOMETRY
H. L. ROYDEN
Stanford University, Stanford, California, U.S.A.
Introduction. The purpose of the present paper is to explore some
relationships between primitive notions in elementary plane geometry
with a view to determining the possibility of defining certain notions in
terms of others. All of our primitive notions are predicates whose argu-
ments are the primitive elements (points or points and lines) and we say
that a primitive F can be defined in terms of a primitive G relative to a
deductive system S if
$$(x, y, z, \ldots)[F(x, y, z, \ldots) \Leftrightarrow \Phi(x, y, z, \ldots)]$$
is a theorem in S where $\Phi$ is a sentential function involving only G and
logical terms in its formation (cf. [10]).
Whether F is definable in terms of G depends not only on the deductive
system S, but also on the logical basis used and our results are sometimes
different if we use only the restricted predicate calculus rather than a
logic which contains the theory of sets. Definitions using only the re-
stricted predicate calculus will be called elementary and the others set-
theoretic. In the present paper all of our definitions are elementary except
for part of Section 5 where there is some discussion of the possibility of
definitions using variables ranging over finite sets of points.
We consider here both Euclidean and non-Euclidean geometry and
use a set of axioms equivalent to Hilbert's without the axioms of com-
pleteness and of Archimedes. We shall sometimes supplement these with
an axiom (P12) to the effect that any line through a point inside a circle
has a point in common with the circle. One of my purposes here is to show
the role played by this axiom in the definability of concepts in elementary
geometry.
Theorem 1 shows that for Euclidean and elliptic geometry this axiom
plays an essential role in the possibility of defining order in terms of
collinearity. With regard to hyperbolic geometry the situation is markedly
different and order can be defined in terms of collinearity independently
of this axiom. As Menger [3, 4] has pointed out, the whole of hyperbolic
geometry can be built on the notion of collinearity. We use here the
elegant definition of order given by Jenks [2], but our treatment of the
definition of congruence differs somewhat from that of Menger and his
students in that we first define orthogonality and use it in the definition
of congruence.
1. The basic elementary geometries. Euclidean geometry. We shall
consider two systems for elementary Euclidean plane geometry. The first
is the system $\mathscr{P}$ which uses the undefined primitives $\beta$ and $\delta$ and consists
of all consequences of the axioms P1-P11 listed below. Intuitively,
$\beta(xyz)$ has the meaning "x, y, and z are collinear and y is between x and z,"
while $\delta(xyzw)$ has the meaning "the segment xy is congruent to the segment
zw." In terms of these notions we define the notion of collinearity:
$$\lambda(xyz) =_{df} \beta(xyz) \vee \beta(yzx) \vee \beta(zxy);$$
and parallelism:
$\pi(xyuv) =_{df}$
Thus $\pi(xyuv)$ states that (x, y) and (u, v) are pairs of distinct points lying
on distinct parallel lines. Our axiom system for $\mathscr{P}$ corresponds to Hilbert's
axiom system, with the exclusion of the axioms of Archimedes and of
completeness, and is equivalent to the Axioms A1-12 of Tarski [12]. In
fact, our axioms are taken directly from Tarski's paper, except that our
version P7 of Pasch's axiom is stronger than Tarski's A7 and together
with the remaining axioms it implies Tarski's A12, which is accordingly
omitted from our list.
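AXIOMS FOR $\mathscr{P}$
P1 $(x)(y)[\beta(xyx) \Rightarrow x = y]$
P2 $(x)(y)(z)(u)[\beta(xyu) \,\&\, \beta(yzu) \Rightarrow \beta(xyz)]$
P3 $(x)(y)(z)(u)[\beta(xyz) \,\&\, \beta(xyu) \,\&\, (x \neq y) \Rightarrow \beta(xzu) \vee \beta(xuz)]$
P4 $(x)(y)\,\delta(xyyx)$
P5 $(x)(y)(z)[\delta(xyzz) \Rightarrow (x = y)]$
P6 $(x)(y)(z)(u)(v)(w)[\delta(xyzu) \,\&\, \delta(xyvw) \Rightarrow \delta(zuvw)]$
P7 $(t)(x)(y)(z)(u)(\exists v)[\beta(ztu) \Rightarrow \lambda(ytv) \,\&\, \{\beta(zvx) \vee \beta(uvx)\}]$
P8 $(t)(x)(y)(z)(u)(\exists v)(\exists w)[\beta(xut) \,\&\, \beta(yuz) \,\&\, (x \neq u) \Rightarrow \beta(xzv) \,\&\, \beta(xyw) \,\&\, \beta(vtw)]$
P9 $(x)(y)(z)(u)(x')(y')(z')(u')[\delta(xyx'y') \,\&\, \delta(yzy'z') \,\&\, \delta(xux'u') \,\&\, \delta(yuy'u') \,\&\, \beta(xyz) \,\&\, \beta(x'y'z') \,\&\, (x \neq y) \Rightarrow \delta(zuz'u')]$
P10 $(x)(y)(u)(v)(\exists z)[\beta(xyz) \,\&\, \delta(yzuv)]$
P11 $(\exists x)(\exists y)(\exists z)[\sim\lambda(xyz)]$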
It should be noted that in the presence of the other axioms, P8 is
equivalent to the following axiom:
P8' $(x)(y)(z)(u)(v)[\pi(xyzu) \,\&\, \pi(xyzv) \Rightarrow \lambda(zuv)]$
The existence axioms in $\mathscr{P}$ guarantee the existence of those points which
are the intersections of lines and those that can be constructed by the
use of a "transferer of segments" (P10). If we wish to have all points which
can be obtained by the use of compasses, we must add the following
axiom:
P12 $(x)(y)(z)(x')(z')(u)(\exists y')[\delta(uxux') \,\&\, \delta(uzuz') \,\&\, \beta(uxy) \,\&\, \beta(xyz) \Rightarrow \delta(uyuy') \,\&\, \beta(x'y'z')]$
This is precisely Tarski's axiom A13', and the geometry having P1-12 as
axioms will be referred to as $\mathscr{P}^*$. It is equivalent to Tarski's system $\mathscr{E}_2''$.
In the presence of the remaining axioms the axiom P12 is equivalent to
the axiom P12' which is stated entirely in terms of the notion $\beta$ and its
derived notions $\lambda$ and $\pi$:
P12' $(x)(y)(z)(\exists u)(\exists v)(\exists w)[\beta(xyz) \,\&\, (x \neq y) \,\&\, (y \neq z) \Rightarrow \lambda(xyw) \,\&\, \lambda(xuv) \,\&\, \pi(uyvw) \,\&\, \pi(uwvz)]$
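If $\mathfrak{F}$ is an ordered field, we define the (two-dimensional) coordinate
geometry $\mathfrak{C}(\mathfrak{F})$ as the set of all ordered pairs $x = (x_1, x_2)$ of elements of $\mathfrak{F}$
with the notions $\beta$ and $\delta$ defined as follows:
$$\beta(xyz) =_{df} [(x_1 - y_1)(y_2 - z_2) = (x_2 - y_2)(y_1 - z_1) \,\&\, 0 \leq (x_1 - y_1)(y_1 - z_1) \,\&\, 0 \leq (x_2 - y_2)(y_2 - z_2)],$$
$$\delta(xyzu) =_{df} [(x_1 - y_1)^2 + (x_2 - y_2)^2 = (z_1 - u_1)^2 + (z_2 - u_2)^2].$$
If $\mathfrak{F}$ has the property that the sum of two squares is a square, we call $\mathfrak{F}$
a Pythagorean field. If $\mathfrak{F}$ is a Pythagorean field, then $\mathfrak{C}(\mathfrak{F})$ is a model for $\mathscr{P}$.
Conversely, any model for $\mathscr{P}$ is isomorphic to $\mathfrak{C}(\mathfrak{F})$ for some Pythagorean
field $\mathfrak{F}$. The models for $\mathscr{P}^*$ are isomorphic to the geometries $\mathfrak{C}(\mathfrak{F})$ where $\mathfrak{F}$
is Euclidean, i.e. has the property that every positive element is a square.
Conversely, each such geometry is a model for $\mathscr{P}^*$.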
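The coordinate definitions can be tried out directly. The sketch below assumes the standard coordinate conditions for betweenness and congruence with non-strict inequalities, and uses exact rational arithmetic purely for illustration (the rationals are an ordered field but not a Pythagorean one, so this is not a model of the axioms); the names `beta`, `delta`, and `lam` are introduced here.

```python
from fractions import Fraction as Fr

def beta(x, y, z):
    """Betweenness: x, y, z collinear with y between x and z (degenerate cases allowed)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    return ((x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))

def delta(x, y, z, u):
    """Congruence of segments xy and zu, compared through squared lengths."""
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
            == (z[0] - u[0]) ** 2 + (z[1] - u[1]) ** 2)

def lam(x, y, z):
    """Collinearity, defined from betweenness as a three-way disjunction."""
    return beta(x, y, z) or beta(y, z, x) or beta(z, x, y)

o, a, b = (Fr(0), Fr(0)), (Fr(1), Fr(1)), (Fr(2), Fr(2))
```

Here `beta(o, a, b)` holds while `beta(o, b, a)` fails, and `lam` holds for the three points in any order.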
Elliptic geometry. One can give a similar set of axioms for elliptic plane
geometry except that order is now expressed by means of a four-place
relation $\gamma(xyzw)$ with the meaning that x, y, z, and w are collinear and the
pair (x, y) does not separate the pair (z, w). Again we get two systems, $\mathscr{E}$
and $\mathscr{E}^*$, depending on whether or not we include the axiom corresponding
to P12. This axiom is the following:
E12 $(x)(y)(z)(w)\{\gamma(xyzw) \Leftrightarrow (\exists r)(\exists s)(\exists t)(\exists u)(\exists v)[\lambda(xyz) \,\&\, \lambda(yzw) \,\&\, \lambda(xyt) \,\&\, \lambda(xuv) \,\&\, \lambda(wrs) \,\&\, \lambda(uyr) \,\&\, \lambda(uts) \,\&\, \lambda(vtr) \,\&\, \lambda(vzs)]\}$.
Let $\mathfrak{F}$ be a Pythagorean field. Then by the elliptic geometry $\mathfrak{E}(\mathfrak{F})$ we
mean the set of ordered triples $x = (x_1, x_2, x_3) \neq (0, 0, 0)$ from $\mathfrak{F}$, where
$(ax_1, ax_2, ax_3)$ is taken to be equivalent to $(x_1, x_2, x_3)$ for $a \neq 0$. We
define $\lambda(x, y, z)$ to mean the triple x, y, and z are linearly dependent; $\delta(xyzw)$
to mean that
The notion $\gamma$ can then be defined in terms of the order in $\mathfrak{F}$ so that $\mathfrak{E}(\mathfrak{F})$
becomes a model for $\mathscr{E}$ and all models of $\mathscr{E}$ are isomorphic to $\mathfrak{E}(\mathfrak{F})$ for
some $\mathfrak{F}$. The geometry $\mathfrak{E}(\mathfrak{F})$ is a model for $\mathscr{E}^*$ if and only if $\mathfrak{F}$ is Euclidean.
In elliptic geometry we can introduce the binary relation $\alpha(xy)$ of polarity
between points which indicates that one point lies on the polar of
the other. We can define collinearity in terms of $\alpha$ by the following
equivalence:
$$\lambda(xyz) \Leftrightarrow (\exists t)[\alpha(tx) \,\&\, \alpha(ty) \,\&\, \alpha(tz)].$$
Hyperbolic geometry. By the elementary hyperbolic geometry $\mathscr{H}$ we
mean the geometry which follows from axioms P1-7 and P9-11 together
with the negation of P8. If we assume also P12, then we call the geometry
$\mathscr{H}^*$.
It should be remarked that the notion $\pi$ which we defined for $\mathscr{P}$ and
$\mathscr{P}^*$ here means non-intersection rather than parallelism. Parallelism will
be denoted by $\pi'$ and is defined as follows:
$\pi'(xyzw) =_{df} (u)(\exists v)\{\pi(xyzw) \,\&\, [\beta(xuw) \Rightarrow \beta(zuv) \,\&$
In $\mathscr{H}$ the axiom P12 is equivalent to the following axiom which asserts
the existence of parallels:
H12 $(x)(y)(z)(\exists w)[\sim\lambda(xyz) \Rightarrow \pi'(xyzw)]$
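Let $\mathfrak{F}$ be a Pythagorean field and e be a positive element in $\mathfrak{F}$ such that
for every $x, y \in \mathfrak{F}$ with $x^2 + y^2 < e$ there is a $z \in \mathfrak{F}$ such that
$z^2 = e - x^2 - y^2$. Then a model $\mathfrak{H}(\mathfrak{F}, e)$ for $\mathscr{H}$ is obtained by taking all pairs
$x = (x_1, x_2)$ of elements from $\mathfrak{F}$ subject to the restriction $x_1^2 + x_2^2 < e$,
where the basic relations are defined by the following conditions:
$$\beta(xyz) =_{df} [(x_1 - y_1)(y_2 - z_2) = (x_2 - y_2)(y_1 - z_1) \,\&\, 0 \leq (x_1 - y_1)(y_1 - z_1) \,\&\, 0 \leq (x_2 - y_2)(y_2 - z_2)],$$
and
$$\delta(xyzw) =_{df} \left[\frac{(e - x_1y_1 - x_2y_2)^2}{(e - x_1^2 - x_2^2)(e - y_1^2 - y_2^2)} = \frac{(e - z_1w_1 - z_2w_2)^2}{(e - z_1^2 - z_2^2)(e - w_1^2 - w_2^2)}\right].$$
Every model of $\mathscr{H}$ is isomorphic to some $\mathfrak{H}(\mathfrak{F}, e)$. If e is a square then $\mathfrak{F}$ is
Euclidean and by a change of coordinates we may take $e = 1$. Every
model of $\mathscr{H}^*$ is isomorphic to $\mathfrak{H}(\mathfrak{F}) = \mathfrak{H}(\mathfrak{F}, 1)$ for some Euclidean field $\mathfrak{F}$.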
2. Relations between order and collinearity. We have defined collinearity
in our geometries in terms of order, i.e. in terms of $\beta$ in the Euclidean and
hyperbolic geometries and in terms of $\gamma$ in the elliptic geometries. In this
section we consider the possibilities of the definitions in the converse
direction. The following propositions show what can be accomplished in
this direction.
PROPOSITION 1. In $\mathscr{P}^*$ we have the following equivalence:
$$\beta(xyz) \vee \beta(xzy) \Leftrightarrow (\exists u)(\exists v)(\exists w)[(x = y) \vee (x = z) \vee (y = z) \vee \{\lambda(xyz) \,\&\, \lambda(xyw) \,\&\, \lambda(xuv) \,\&\, \pi(uyvw) \,\&\, \pi(uwvz)\}].$$
PROPOSITION 2. In $\mathscr{E}^*$ we have the following equivalence 1:
$$\gamma(xyzw) \Leftrightarrow (\exists r)(\exists s)(\exists t)(\exists u)(\exists v)[\lambda(xyz) \,\&\, \lambda(yzw) \,\&\, \lambda(xyt) \,\&\, \lambda(xuv) \,\&\, \lambda(wrs) \,\&\, \lambda(uyr) \,\&\, \lambda(uts) \,\&\, \lambda(vtr) \,\&\, \lambda(vzs)].$$
1 This equivalence was first pointed out and used by Pieri [6] to define order in
Projective Geometry.
PROPOSITION 3. In $\mathscr{H}$ we have the following equivalence 2:
$$\beta(xyz) \Leftrightarrow (u)(v)(\exists w)[\lambda(xyz) \,\&\, \lambda(wyv) \,\&\, \{\lambda(wux) \vee \lambda(wuz)\}].$$
THEOREM 1. Order can be defined in terms of collinearity in $\mathcal{E}^*$, $\mathcal{P}^*$ and
$\mathcal{H}$. On the other hand, order cannot be defined in $\mathcal{E}$ and $\mathcal{P}$ on the basis of
collinearity and congruence.
The possibility of defining order in $\mathcal{E}^*$, $\mathcal{P}^*$ and $\mathcal{H}$ follows from Propositions
1-3. To show the impossibility of defining order in $\mathcal{E}$ and $\mathcal{P}$ solely
in terms of collinearity, we shall use the method of Padoa (cf. [10] and
[11]) and construct the following model: Let $\mathfrak{F}$ be the smallest field containing
all algebraic numbers and an indeterminate $\omega$, and closed under the
operation of taking the square root of a sum of squares. Thus each element
of $\mathfrak{F}$ is an algebraic function $F(\omega)$ with algebraic coefficients. We make $\mathfrak{F}$
into two distinct ordered fields $\mathfrak{F}_1$ and $\mathfrak{F}_2$ by taking two different real
transcendental numbers $\omega_1$ and $\omega_2$ and in $\mathfrak{F}_1$ setting $F(\omega) > 0$ if $F(\omega_1) > 0$
and in $\mathfrak{F}_2$ setting $F(\omega) > 0$ if $F(\omega_2) > 0$. If we form the coordinate
geometries $\mathfrak{C}(\mathfrak{F}_1)$ and $\mathfrak{C}(\mathfrak{F}_2)$ (or equivalently $\mathfrak{P}(\mathfrak{F}_1)$ and $\mathfrak{P}(\mathfrak{F}_2)$), then the
natural isomorphism is a (1-1) mapping which preserves collinearity and
congruence but not order.
3. The notion of orthogonality. Scott [9] has introduced the notion
$\tau(xyz)$ whose meaning is that $x$, $y$, and $z$ form a triangle with a right angle at
$x$. This notion can be defined in terms of congruence as follows:
$\tau(xyz) =_{df} (\exists u)(\exists v)[(u \ne y) \mathbin{\&} (u \ne z) \mathbin{\&} (v \ne y) \mathbin{\&} (v \ne z) \mathbin{\&} \delta(xyxv) \mathbin{\&} \delta(xzxu) \mathbin{\&} \delta(yzuv) \mathbin{\&} \delta(yzzv) \mathbin{\&} \delta(yzyu)]$
In this section we shall show that collinearity and congruence can be
defined in terms of $\tau$. For collinearity we have the following proposition:
PROPOSITION 4. In $\mathcal{E}$, $\mathcal{H}$, and $\mathcal{P}$ we have
$\lambda(xyz) \leftrightarrow (\exists r)[\tau(rxy) \mathbin{\&} \tau(rxz)]$.
In order to define the congruence relation $\delta$, we introduce the auxiliary
relation $\mu$ defined as follows:
$\ldots \mathbin{\&} \delta(xyxz)]$.
2 This definition of order was given by Jenks [2].
It is easy to define $\delta$ in terms of the notion of two points being equidistant
from a third (cf. Pieri [7]). But this latter notion can be defined in terms of
$\mu$ and $\tau$ by the following proposition due to Scott:
PROPOSITION 5. In $\mathcal{E}$, $\mathcal{P}$, and $\mathcal{H}$ we have
$\delta(xyyz) \leftrightarrow (\exists r)[\mu(yxz) \lor \{\mu(rxz) \mathbin{\&} \tau(rxy)\}]$.
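Thus we can define $\delta$ from $\tau$ if we can define $\mu$ from $\tau$. This is accomplished
by the following two propositions, the first of which is due to
Scott. Considerations similar to the second are found in Robinson [8],
Section 3.
PROPOSITION 6. In $\mathcal{P}$ we have
$\mu(xyz) \leftrightarrow \{[x = y \mathbin{\&} x = z] \lor (\exists u)(\exists v)[\tau(uyz) \mathbin{\&} \tau(vyz) \mathbin{\&} \tau(yuv) \mathbin{\&} \tau(zuv) \mathbin{\&} \tau(xyu) \mathbin{\&} \tau(xyv) \mathbin{\&} \tau(xzu) \mathbin{\&} \tau(xzv)]\}$.
PROPOSITION 7. In $\mathcal{E}$ and $\mathcal{H}$ we have
$\mu(xyz) \leftrightarrow \{[x = y \mathbin{\&} x = z] \lor (r)(\exists u)(\exists v)(\exists s)[\tau(xry) \supset \tau(xrz) \mathbin{\&} \tau(yxu) \mathbin{\&} \tau(zxv) \mathbin{\&} \tau(rxu) \mathbin{\&} \tau(rxv) \mathbin{\&} \tau(vzs) \mathbin{\&} \tau(uys) \mathbin{\&} \lambda(xrs)]\}$.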
These propositions together with the example at the end of the previous
section give us the following theorem :
THEOREM 2. In $\mathcal{E}$, $\mathcal{H}$, and $\mathcal{P}$ the notions of collinearity $\lambda$ and congruence
$\delta$ can be defined in terms of $\tau$. Thus in $\mathcal{E}^*$, $\mathcal{P}^*$, and $\mathcal{H}$ we may use
the relation $\tau$ as the sole primitive notion. On the other hand, $\tau$ does not suffice
for the definition of order in $\mathcal{E}$ and $\mathcal{P}$.
4. Collinearity as the sole primitive in $\mathcal{H}^*$. In this section we shall show
that in $\mathcal{H}^*$ the notion of congruence can be defined in terms of collinearity
$\lambda$ (cf. [1], [4], and [5]). Since we have shown in the previous
section that the notion $\tau$ of orthogonality can be used as the sole primitive
in $\mathcal{H}$, it will suffice to show that $\tau$ can be defined in terms of $\lambda$.
We begin by using an auxiliary relation $\psi$ defined as follows:
$\psi(xyuv) =_{df} (\exists w)(\exists u')(\exists v')[\pi'(xyuw) \mathbin{\&} \pi'(xyvv') \mathbin{\&} \pi'(yxuu') \mathbin{\&} \pi(yxvw) \mathbin{\&} \pi(xwu'u) \mathbin{\&} \pi'(xwv'v)]$.
The meaning of $\psi$ can best be seen by using a model $\mathfrak{H}(\mathfrak{F}, 1)$ of $\mathcal{H}^*$ and
assuming without loss of generality that $x$ has coordinates (0, 0). The
field $\mathfrak{F}$ is of course a Euclidean field. The definition of $\psi(xyuv)$ then states
that $uv$ and $xy$ are diagonals of a quadrilateral in $\mathfrak{C}(\mathfrak{F})$ whose third diagonal
passes through $x$. Since, in the Euclidean geometry of $\mathfrak{C}(\mathfrak{F})$, $x$ is
the midpoint of the diagonal $xy$, we must have $uv$ parallel to $xy$ in the
Euclidean sense.
With the above explanation in mind we see that for points $x$, $y$, $z$ of
$\mathfrak{H}(\mathfrak{F}, 1)$ with $x = (0, 0)$, $zx$ will be perpendicular to $xy$ in the Euclidean
geometry of $\mathfrak{C}(\mathfrak{F})$ if and only if there are points $u$, $v$, $r$, $s$, and $t$ such that
the formulas $\psi(xyuv)$, $\psi(xzrs)$, $\pi'(uvsr)$, $\pi'(xtrs)$, and $\pi'(txvu)$ hold in
$\mathfrak{H}(\mathfrak{F}, 1)$. However, the Euclidean and hyperbolic notions of a right angle
coincide at the origin of $\mathfrak{C}(\mathfrak{F})$. Thus we have the following proposition:
PROPOSITION 8. In $\mathcal{H}^*$ we have
$\tau(xyz) \leftrightarrow (\exists u)(\exists v)(\exists r)(\exists s)(\exists t)[\psi(xyuv) \mathbin{\&} \psi(xzrs) \mathbin{\&} \pi'(uvsr) \mathbin{\&} \pi'(xtrs) \mathbin{\&} \pi'(txvu)]$.
Since $\pi'$ was defined in terms of $\lambda$ alone, and since we have seen that $\tau$
can be used as the sole primitive notion in $\mathcal{H}$, we have the following
corollary:
COROLLARY 3. In $\mathcal{H}^*$ collinearity can be taken as the sole primitive
notion.
5. Units of length. In the elliptic and hyperbolic geometries we have
natural units of length, and the question immediately arises whether or
not the notion of two points being at a unit distance can serve as a
primitive. In the elliptic geometry the most natural distance to take is
one-half the length of a straight line. We call this distance $P$ the polar
distance, and define the relation $\alpha(xy)$ to mean that $x$ and $y$ are at distance
$P$. This notion is easily defined in terms of congruence and collinearity,
and conversely we can define orthogonality in terms of it as follows:
$\tau(xyz) \leftrightarrow (\exists u)(\exists v)\{\alpha(ux) \mathbin{\&} \alpha(uz) \mathbin{\&} \alpha(uv) \mathbin{\&} \alpha(vx) \mathbin{\&} \alpha(vy)\}$.
This together with the example in Section 2 gives us the following
proposition :
3 I suspect that this corollary is still true if $\mathcal{H}^*$ is replaced by $\mathcal{H}$, but I have not
carried out a proof. The fact that there are no parallels in a model of $\mathcal{H}$ which is not
a model for $\mathcal{H}^*$ complicates considerations of this sort, but the method of Menger
in [5] may be applicable.
PROPOSITION 9. The binary relation $\alpha$ of two points being at the polar
distance can be used as the sole primitive notion in $\mathcal{P}^*$ but not in $\mathcal{P}$.
On the other hand, if we use the notion $\alpha'(xy)$ of two points being at a
distance less than $P/2$, we may define $\alpha(xy)$ as $\sim(\exists u)[\alpha'(xu) \mathbin{\&} \alpha'(uy)]$.
Thus in $\mathcal{P}$ we can define collinearity and congruence in terms of $\alpha'$ and
it is not too difficult to define order in terms of $\alpha'$. Thus we have the
following:
PROPOSITION 10. In $\mathcal{P}$ the binary notion $\alpha'$ of two points being closer
than half the polar distance may be used as the sole primitive.
Robinson [8] has shown that in $\mathcal{E}^*$ with a unit of length introduced
as a new primitive we cannot use the unit of length to define collinearity
or congruence in elementary terms. If the points $x$, $y$, and $z$ are within
a fixed integer multiple of the unit distance of one another, then, as
Seidenberg has pointed out, the collinearity of $xyz$ can be defined in $\mathcal{E}^*$
in terms of the relation of two points being at a unit distance. This
definition follows from the principle of the Peaucellier inversor (cf.
Robinson [8], Section 6). If we enlarge our logical basis to include finite
sets of elements and add to $\mathcal{E}^*$ the axiom of Archimedes, then we may
use the unit of length as a sole primitive. Similar results hold for hyperbolic
geometry.
6. Geometries with points and lines as basic elements. It follows from
Robinson's results in [8] that it is not possible to find a binary relation
which will serve in elementary terms as the only primitive notion for $\mathcal{H}^*$
and $\mathcal{E}^*$ even if we adjoin a unit of length to $\mathcal{E}^*$. We may use a single
binary primitive notion for these geometries, however, if we cease to
regard them as elementary statements about relations between points and
instead regard a geometry as a class of statements about points and lines
and relations between them. Thus, we can define a hyperbolic geometry
which uses the single primitive $\varepsilon$ of incidence between point and line.
In terms of $\varepsilon$ we define the unary notion $p(x)$ of being a point as follows:
$p(x) =_{df} (y)(z)(\exists u)[\{\varepsilon(xu) \mathbin{\&} \varepsilon(yu)\} \lor \{\varepsilon(zy) \supset \varepsilon(zu) \mathbin{\&} \varepsilon(xu)\}]$.
With this we define collinearity among points by
$\lambda(xyz) =_{df} (\exists w)\{p(x) \mathbin{\&} p(y) \mathbin{\&} \varepsilon(xw) \mathbin{\&} \varepsilon(yw) \mathbin{\&} \varepsilon(zw)\}$.
We now add the axioms and definitions of $\mathcal{H}^*$ (together with some
additional axioms to ensure that elements which are not points are lines).
We then have a geometry which is isomorphic to $\mathcal{H}^*$ when relativised to
statements which only contain points as variables.
A similar procedure is possible for Euclidean geometry with a unit of
length. Let $\zeta(xy)$ be the binary relation which states that $x$ is a point at
unit distance from the line $y$ or else $y$ is a point at unit distance from the
line $x$. As before we define a point by the condition:
$p(x) =_{df} (y)(z)(\exists u)[\{\zeta(xu) \mathbin{\&} \zeta(yu)\} \lor \{\zeta(zy) \supset \zeta(xu) \mathbin{\&} \zeta(zu)\}]$.
We can define collinearity by noting that five distinct points are collinear
if they are all at a unit distance from each of two distinct lines. From
this we can construct a point geometry corresponding to $\mathcal{E}^*$ with a unit
of length.
The method of Lindenbaum and Tarski [11] enables one to show
that in Euclidean geometry without a unit of length there is no binary
relation between points and lines from which we can define congruence.
Bibliography
[1] ABBOTT, J. C., The projective theory of non-Euclidean geometry. Reports of a
Mathematical Colloquium, University of Notre Dame Press (1941-1944), pp.
13-51.
[2] JENKS, F. P., A set of postulates for Bolyai-Lobatchevsky geometry. Proceedings
of the National Academy of Sciences, vol. 26 (1940), pp. 277-279.
[3] MENGER, K., Non-Euclidean geometry of joining and intersecting. Bulletin of
the American Mathematical Society, vol. 44 (1938), pp. 821-824.
[4] MENGER, K., A new foundation of non-Euclidean, affine, real projective and Euclidean
geometry. Proceedings of the National Academy of Sciences, vol. 24 (1938),
p. 486.
[5] MENGER, K., New projective definitions of the concepts of hyperbolic geometry. Reports
of a Mathematical Colloquium, University of Notre Dame Press, Series 2,
no. 7 (1946), pp. 20-28.
[6] PIERI, M., I principi della geometria di posizione composti in sistema logico
deduttivo. Memorie della Reale Accademia delle Scienze di Torino, vol. 48
(1899), pp. 1-62.
[7] PIERI, M., La geometria elementare istituita sulle nozioni di 'punto' e 'sfera'. Memorie
di Matematica e di Fisica della Scienze, ser. 3, vol. 15 (1908), pp. 345-450.
[8] ROBINSON, R. M., Binary relations as primitive notions in elementary geometry.
This volume, pp. 68-85.
[9] SCOTT, D., A symmetric primitive notion for Euclidean geometry. Indagationes
Mathematicae, vol. 18 (1956), pp. 457-461.
[10] TARSKI, A., Some methodological investigations on the definability of concepts.
Logic, Semantics, Metamathematics, Oxford 1956, art. X.
[11] TARSKI, A., On the limitations of the means of expression of deductive theories. Logic,
Semantics, Metamathematics, Oxford 1956, art. XIII (joint article with A.
Lindenbaum).
[12] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.
Symposium on the Axiomatic Method
DIRECT INTRODUCTION OF WEIERSTRASS HOMOGENEOUS
COORDINATES IN THE HYPERBOLIC PLANE,
ON THE BASIS OF THE ENDCALCULUS OF HILBERT ¹
PAUL SZASZ
Loránd Eötvös University of Budapest, Budapest, Hungary
Introduction. In the present paper let any system of "points" and
"lines" be called hyperbolic plane for which, besides the groups of axioms
of incidence, of order and of congruence of plane I, II, III of Hilbert [3],
[4], the following two axioms are valid:
AXIOM IV$_1$. Let $P$, $Q$ be two different points in the plane and $QY$ a half-line
on the one side of the line $PQ$, then there exists always one half-line $PX$
on the same side of $PQ$ that does not intersect $QY$, while every internal half-line
$PZ$ lying in the $\angle QPX$ cuts the half-line $QY$ (Fig. 1).
Fig. 1
Fig. 2
AXIOM IV$_2$. There exists a line $s_0$ and a point $P_0$ outside it in the plane,
for which two different lines could be drawn through $P_0$ that do not intersect
$s_0$ (Fig. 2).
I have shown [7], [8] that these axioms imply the following theorem.
1 A more detailed exposition has been published in German (see [12]).
THEOREM. If $s$ is an arbitrary line and $P$ an arbitrary point outside it,
then the lines drawn through $P$ and intersecting $s$ form the internal lines of a
certain $\angle(p_1, p_2)$ (Fig. 3). These lines $p_1$, $p_2$, which do not intersect $s$ any
more, are called parallels to $s$ through $P$.
Fig. 3
This Theorem was laid down by Hilbert [3] as Axiom IV. The Axioms
IV$_1$, IV$_2$ mentioned above form, together with the axiom-groups I, II,
III, apparently weaker assumptions than those of I, II, III, IV by the
quoted author.
In the work cited above Hilbert called "ends" the points at infinity
of the plane defined by any pencil of parallel lines. A line possesses, in
consequence of the above Theorem, always two ends. After the proof of
the fundamental theorem, according to which two lines neither inter-
secting each other nor being parallel, must have a common perpendicular,
Hilbert was able to prove also the existence of that line which possesses
two prescribed ends. From this it follows, that a determined perpendicular
can be dropped on a line from an end not belonging to it. From among the
preliminary theorems, stated by Hilbert for his so called endcalculus, I
wish to stress only the one just mentioned. This endcalculus I am going
to explain below, in § 1 .
The way sketched by Hilbert [3] for the construction of hyperbolic
geometry in the plane, leads through projective geometry. In contrast
to that way there will be created in the present paper a completely
elementary construction of hyperbolic plane geometry by means of direct
introduction of certain homogeneous coordinates and an independent
foundation of hyperbolic analytic geometry. Henceforth these coordinates
will be called the Weierstrass homogeneous coordinates, because they are
identical with the well-known ones if one assumes the axioms of continuity
instead of Axiom IV$_1$, making the incomplete axiom-system
complete [9]. This construction of hyperbolic geometry does not depend
on hyperbolic trigonometry, the latter being a consequence of the analytic
geometry of the hyperbolic plane, founded here. ² Neither do I make use
of Euclidean geometry, and therefore my exposition may be called an
independent elementary foundation of hyperbolic plane geometry.
1. The endcalculus of Hilbert. The distance-function $\mathfrak{E}(t)$ and those
developed from it. The endcalculus of Hilbert, somewhat altered for my
purpose, follows.
Let a right angle in the plane be given with the vertex $O$, the sides of
which as half-lines have the ends $\Omega$, $E$ (Fig. 4).
Fig. 4
The end $\Omega$ (called by Hilbert $\infty$) will be distinguished, and the endcalculus
defined for the ends different from $\Omega$. Such an end $\alpha$ should be
called positive when the lines $\alpha\Omega$ and $E\Omega$ are lying on the same side of
the line $O\Omega$, and in case these lines lie on different sides of $O\Omega$, the end $\alpha$
is called negative. The other end of the reflection of the line $\alpha\Omega$ in $O\Omega$
should be denoted with $-\alpha$, and the other one of $O\Omega$ with $0$. The addition
of the ends is defined by Hilbert as follows.
2 For the case of the assumption of the axioms of continuity, see Szasz [10].
Let $\alpha$ and $\beta$ be ends differing from $\Omega$. The reflections of $O$ in $\alpha\Omega$ and $\beta\Omega$
should be denoted with $O_\alpha$, $O_\beta$ respectively (Fig. 5). The middle point of
the segment $O_\alpha O_\beta$ being denoted with $M$, we define as the "sum $\alpha + \beta$" the
other end of the line $M\Omega$.
Fig. 5
The definition of the product might be expressed more simply by introducing,
unlike Hilbert, the following distance-function that is going to be
essential all through our treatment (cf. Szasz [11]).
Directing the line $O\Omega$ towards $\Omega$, let us draw a perpendicular to $O\Omega$ through
the end-point $A$ of the segment $OA = t$, regard being paid to sign. Let the
positive end $\alpha$ of this perpendicular be designated with $\mathfrak{E}(t)$:
(1) $\alpha = \mathfrak{E}(t)$
(Fig. 6). Evidently any positive end $\alpha$ corresponds to one and only one
distance $t$ with sign.
Using the designation (1) we define as the "product $\alpha_1\alpha_2$" of the positive
ends $\alpha_1 = \mathfrak{E}(t_1)$ and $\alpha_2 = \mathfrak{E}(t_2)$ the end $\mathfrak{E}(t_1 + t_2)$, i.e.
(2) $\mathfrak{E}(t_1)\,\mathfrak{E}(t_2) = \mathfrak{E}(t_1 + t_2)$
(Fig. 7). Further we agree that for positive ends $\alpha$, $\beta$
(3) $\alpha(-\beta) = (-\alpha)\beta = -\alpha\beta, \quad (-\alpha)(-\beta) = \alpha\beta$,
and for any end $\xi$ differing from $\Omega$ there should hold
(4) $\xi \cdot 0 = 0 \cdot \xi = 0$.
Fig. 6
Fig. 7
Thus we have given the definition of the multiplication of ends differing
from $\Omega$ in every case, this being equivalent to Hilbert's definition.
The positive end $E$ is by designation (1) $\mathfrak{E}(0)$, playing the part of the
positive unity since according to (2) $\mathfrak{E}(t)\mathfrak{E}(0) = \mathfrak{E}(0)\mathfrak{E}(t) = \mathfrak{E}(t)$. That is
why we introduce the designation
(5) $\mathfrak{E}(0) = 1$,
which by (2) may be written also as
(5*) $\mathfrak{E}(t)\,\mathfrak{E}(-t) = 1$.
The end designated with $0$, which, according to (4), under multiplication
plays the part of zero, behaves under addition also like zero,
because evidently for any end $\xi$ differing from $\Omega$ holds
(6) $\xi + 0 = 0 + \xi = \xi$
and
(7) $\xi + (-\xi) = 0$.
D. Hilbert showed in his work (cited above) that in the endcalculus defined
in such a way, the familiar laws are valid concerning the four rules of
arithmetic. Or, using a modern expression: the ends differing from $\Omega$ form a
commutative field. This field moreover has the fundamental property of
any positive end being a square. Indeed in the sense of (2), we have
$\mathfrak{E}(t) = [\mathfrak{E}(t/2)]^2$.
The field of ends different from $\Omega$ can be made an ordered field by the
following agreement: let $\alpha$ be called greater than $\beta$ ($\beta$ less than $\alpha$), in symbols
$\alpha > \beta$ ($\beta < \alpha$), in case the end $\alpha - \beta$ is positive. One is easily convinced
that for positive ends $\alpha$, $\beta$ in case of $\alpha > \beta$ the line $\beta\Omega$ lies between the lines
$O\Omega$, $\alpha\Omega$, and vice versa. From this it results that $\mathfrak{E}(t) > 1$ if $t > 0$, and
then it follows at once that in general
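For the sake of brevity it is also suitable to introduce besides the
distance-function $\mathfrak{E}(t)$ the following ones too:
(9) $C(t) = \dfrac{\mathfrak{E}(t) + \mathfrak{E}(-t)}{2}, \quad S(t) = \dfrac{\mathfrak{E}(t) - \mathfrak{E}(-t)}{2}, \quad T(t) = \dfrac{S(t)}{C(t)}$.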
While $\mathfrak{E}(t)$ is the analogue of the exponential function, these latter
distance-functions are the analogues of the hyperbolic functions. For the
first two holds e.g. the fundamental formula
(10) $C(t)^2 - S(t)^2 = 1$
and also the formulas
(11) $C(a + b) = C(a)C(b) + S(a)S(b)$
and
(12) $S(a + b) = S(a)C(b) + S(b)C(a)$
are valid for them.
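These formulas can be checked numerically in the classical real case. The sketch below assumes that the ends are real numbers and that $\mathfrak{E}(t) = e^{t}$; this is an assumption suggested by the remark that the distance-function is the analogue of the exponential function, not something derived from the axioms used here. It also takes $C$ and $S$ to be the half-sum and half-difference of $\mathfrak{E}(t)$ and $\mathfrak{E}(-t)$, and $T$ their quotient. The Python names `E`, `C`, `S`, `T` are chosen for the illustration.

```python
import math

def E(t):
    """Real-case stand-in for the distance-function (assumption: E(t) = exp(t))."""
    return math.exp(t)

def C(t):
    return (E(t) + E(-t)) / 2

def S(t):
    return (E(t) - E(-t)) / 2

def T(t):
    return S(t) / C(t)

a, b = 0.3, -1.2
# product rule for positive ends and the unit relation E(t)E(-t) = 1
print(math.isclose(E(a) * E(b), E(a + b)), math.isclose(E(a) * E(-a), 1.0))
# fundamental formula and the two addition formulas
print(math.isclose(C(a) ** 2 - S(a) ** 2, 1.0))
print(math.isclose(C(a + b), C(a) * C(b) + S(a) * S(b)))
print(math.isclose(S(a + b), S(a) * C(b) + S(b) * C(a)))
```

In this real model `C`, `S` and `T` are simply the hyperbolic cosine, sine and tangent, so each check should print `True` up to rounding.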
Also, these distance-functions in (9) remind us of the hyperbolic functions,
just as the distance-function $\mathfrak{E}(t)$ reminds us of the exponential
function, e.g. it satisfies the inequality (8).
Fig. 8
2. The Weierstrass homogeneous coordinates of a point. An arbitrary
point $P$ in the plane may be characterized (Fig. 8) with the two data
mentioned below. One of them is the other end of line $P\Omega$, let it be $\alpha$.
However, $O_{\alpha/2}$ being the reflection of $O$ in the line with the ends $\frac{\alpha}{2}$, $\Omega$,
the other end of the line $O_{\alpha/2}\Omega$ is according to the definition of the sum of
ends (§ 1), $\frac{\alpha}{2} + \frac{\alpha}{2} = \alpha$, that is to say the line $\alpha\Omega$ is the reflection of $O\Omega$
in the previous line with the ends $\frac{\alpha}{2}$ and $\Omega$. Consequently the reflection
$P'$ of $P$ in this line joining the ends $\frac{\alpha}{2}$ and $\Omega$ lies in $O\Omega$. Now the distance
$OP' = t$ taken with sign on the line $O\Omega$ directed towards $\Omega$ is the other
datum, evidently determining $P$ together with the end $\alpha$ mentioned
before. These data $t$, $\alpha$ should be called mixed-coordinates of point $P$. By
means of these may be proved the following
THEOREM. The points of the hyperbolic plane and the end-triads
$(x_1, x_2, x_3)$, built with ends differing from $\Omega$ for which holds
(1) $x_3^2 - x_2^2 - x_1^2 = 1$
and
(2) $x_3 > 0$,
are put in one-to-one correspondence. This correspondence might be produced
by making each point $(t, \alpha)$ given in mixed-coordinates correspond to the
end-triad
(3) $x_1 = S(t) + \tfrac{1}{2}\alpha^2\,\mathfrak{E}(-t), \quad x_2 = \alpha\,\mathfrak{E}(-t), \quad x_3 = C(t) + \tfrac{1}{2}\alpha^2\,\mathfrak{E}(-t)$.
The concept of inequality (§ 1) is made use of in the proof.
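The correspondence of the Theorem can be tried out numerically. The following sketch assumes the real case $\mathfrak{E}(t) = e^{t}$ and takes (3) in the form $x_1 = S(t) + \tfrac12\alpha^2\mathfrak{E}(-t)$, $x_2 = \alpha\,\mathfrak{E}(-t)$, $x_3 = C(t) + \tfrac12\alpha^2\mathfrak{E}(-t)$, which is the form consistent with relation (1). The function names `weierstrass` and `mixed` are hypothetical.

```python
import math

def E(t):
    return math.exp(t)  # assumption: real case, E(t) = exp(t)

def weierstrass(t, alpha):
    """Mixed coordinates (t, alpha) -> Weierstrass coordinates (x1, x2, x3)."""
    k = 0.5 * alpha ** 2 * E(-t)
    x1 = (E(t) - E(-t)) / 2 + k   # S(t) + alpha^2 E(-t) / 2
    x2 = alpha * E(-t)
    x3 = (E(t) + E(-t)) / 2 + k   # C(t) + alpha^2 E(-t) / 2
    return x1, x2, x3

def mixed(x1, x2, x3):
    """Inverse map, using x3 - x1 = E(-t) and x2 = alpha * E(-t)."""
    return -math.log(x3 - x1), x2 / (x3 - x1)

x1, x2, x3 = weierstrass(0.8, -1.5)
print(math.isclose(x3 ** 2 - x2 ** 2 - x1 ** 2, 1.0), x3 > 0)  # relations (1) and (2)
print(weierstrass(0.0, 0.0))                                   # the point O
```

The inverse map shows why the correspondence is one-to-one in this model: $t$ and $\alpha$ are recovered from $x_3 - x_1$ and $x_2$ alone.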
The ends $x_1$, $x_2$, $x_3$ in (3) should be called Weierstrass homogeneous
coordinates of the point the mixed-coordinates of which are $t$, $\alpha$. From
(3) follows for the case $t = 0$, $\alpha = 0$, that the Weierstrass homogeneous
coordinates of point $O$ are
(4) $x_1 = 0, \quad x_2 = 0, \quad x_3 = 1$.
Later on, for the transformation of the coordinates, it becomes of
fundamental importance that for any two points $(x_1, x_2, x_3)$ and $(\bar{x}_1, \bar{x}_2, \bar{x}_3)$
holds
(5) $x_3\bar{x}_3 - x_2\bar{x}_2 - x_1\bar{x}_1 > 0$.
3. The equation of the line. Weierstrass homogeneous line coordinates.
The derivation of the equation of the line may be based upon the two
Lemmas of Hilbert [3] mentioned below.
LEMMA 1. $\alpha$, $\beta$ being ends different from $\Omega$, the reflection of the line $\alpha\Omega$
in $\beta\Omega$ is the line joining the end $2\beta - \alpha$ with $\Omega$.
LEMMA 2. For the ends $\alpha$, $\beta$ of a line that goes through $O$ holds $\alpha\beta = -1$.
This plainly follows from the fact that if from among these ends the
positive one is $\alpha = \mathfrak{E}(t)$, then the other one is evidently $\beta = -\mathfrak{E}(-t)$;
their product is really $-\mathfrak{E}(-t)\,\mathfrak{E}(t) = -\mathfrak{E}(0) = -1$.
Fig. 9
Let us consider first a line possessing the ends $\xi$, $\eta$ differing from $\Omega$
(Fig. 9). Let an arbitrary point of this line be given in mixed-coordinates
(§ 2) $P(t, \alpha)$. Then by reflecting the plane in the line that joins the end $\frac{\alpha}{2}$
with $\Omega$ and after that translating it along $O\Omega$ by the piece $-t$, the point
$P$ goes into $O$. The ends $\xi$, $\eta$ by this reflection go into the ends $\alpha - \xi$,
$\alpha - \eta$, respectively, according to Lemma 1, and the latter ends go into
$\mathfrak{E}(-t)(\alpha - \xi)$, $\mathfrak{E}(-t)(\alpha - \eta)$, respectively, as follows from the definition
of the product of ends (§ 1). Since this line goes through point $O$ already
(because $P$ is turned into $O$), the product of these two ends due to Lemma
2 is (§ 1, (2))
(1) $\mathfrak{E}(-2t)(\alpha - \xi)(\alpha - \eta) = -1$.
It may be seen at once that conversely, if (1) holds for a certain point
$(t, \alpha)$, then this point lies on the line $\xi\eta$. That is to say, (1) is the equation
of the line connecting the ends $\xi$, $\eta$ expressed in mixed-coordinates.
Now the equation (1) can be transformed into Weierstrass homogeneous
coordinates $x_1$, $x_2$, $x_3$. Namely, we obtain from formulas (3) of the preceding
section, by multiplying (1) with $\mathfrak{E}(t)$, that the line joining the ends
$\xi$, $\eta$ differing from $\Omega$ has the equation
(2) $(\xi\eta - 1)x_1 + (\xi + \eta)x_2 - (\xi\eta + 1)x_3 = 0$
in Weierstrass homogeneous coordinates.
In mixed-coordinates $t$, $\alpha$ the equation of the line $\eta\Omega$ with the end $\eta$ is
evidently $\alpha - \eta = 0$. Multiplying by $\mathfrak{E}(-t)$ and writing in terms of the
coordinates $x_1$, $x_2$, $x_3$ we see that the equation of the line connecting the end
$\eta$ with $\Omega$ is in Weierstrass homogeneous coordinates
(3) $\eta x_1 + x_2 - \eta x_3 = 0$.
By introducing the designations
(4) $u = \dfrac{\xi\eta - 1}{\xi - \eta}, \quad v = \dfrac{\xi + \eta}{\xi - \eta}, \quad w = \dfrac{\xi\eta + 1}{\xi - \eta}$,
equation (2) divided by $\xi - \eta$ takes the form
(2*) $ux_1 + vx_2 - wx_3 = 0$
where
(5) $u^2 + v^2 - w^2 = 1$.
The ends $u$, $v$, $w$ in (4) should be called the Weierstrass homogeneous line
coordinates of the line $\xi\eta$ directed towards $\xi$, and (2*) the normal-form of the
equation of this line.
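A numerical sketch may help to see how the line coordinates behave. It assumes the real case $\mathfrak{E}(t) = e^{t}$, the designations $u = (\xi\eta - 1)/(\xi - \eta)$, $v = (\xi + \eta)/(\xi - \eta)$, $w = (\xi\eta + 1)/(\xi - \eta)$, and the mixed-coordinate equation $\mathfrak{E}(-2t)(\alpha - \xi)(\alpha - \eta) = -1$ of the line. The helper names `line_coords` and `point_on_line` are hypothetical.

```python
import math

def line_coords(xi, eta):
    """Line coordinates (u, v, w) of the line with ends xi, eta, directed towards xi."""
    d = xi - eta
    return (xi * eta - 1) / d, (xi + eta) / d, (xi * eta + 1) / d

def point_on_line(xi, eta, alpha):
    """Weierstrass coordinates of the point of the line (xi, eta) whose end
    coordinate is alpha; alpha must lie strictly between the two ends.
    Solves E(-2t) (alpha - xi) (alpha - eta) = -1 for t, with E = exp."""
    t = 0.5 * math.log(-(alpha - xi) * (alpha - eta))
    k = 0.5 * alpha ** 2 * math.exp(-t)
    return math.sinh(t) + k, alpha * math.exp(-t), math.cosh(t) + k

u, v, w = line_coords(3.0, -1.0)
x1, x2, x3 = point_on_line(3.0, -1.0, 1.0)
print(math.isclose(u * u + v * v - w * w, 1.0))   # normalisation of (u, v, w)
print(abs(u * x1 + v * x2 - w * x3) < 1e-12)      # the point satisfies the normal form
print(line_coords(1.0, -1.0))                     # the line OE directed towards E
```

Swapping the two ends in `line_coords` changes the sign of all three coordinates, which corresponds to reversing the direction of the line.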
Per definitionem, the Weierstrass homogeneous line coordinates of
the line connecting the end $\eta$ with $\Omega$ and directed towards $\Omega$ are to be
(4*) $u = \eta, \quad v = 1, \quad w = \eta$
and further let (3) be the normal-form of the equation of this line. By
reversing orientation, the line coordinates are multiplied by $(-1)$ and the
equation multiplied with $(-1)$ should be called the normal form.
It may be easily shown that every equation (2*) in which (5) holds for
the coefficients $u$, $v$, $w$ is the normal-form of the equation of a certain directed
line.
4. Transformation of the Weierstrass coordinates. Let us take beside
the right angle $\Omega OE$ that we have used in the definition of the endcalculus,
yet another right angle $\Omega'O'E'$ where $\Omega'$ and $E'$ are ends. Consider the
congruence transformation of the plane into itself that superposes the
right angle $\Omega'O'E'$ on $\Omega OE$. A certain directed line $e$ should be transformed
into $e'$ by this transformation. We mean by Weierstrass homogeneous line
coordinates of the directed line $e$ with respect to the "coordinate-system"
$\Omega'O'E'$ the ones of $e'$ with respect to the original system $\Omega OE$.
We define in a similar way the Weierstrass homogeneous coordinates of a
point $P$ with respect to the coordinate-system $\Omega'O'E'$.
The connection of the new coordinates with the old ones can be considered
first for the line coordinates, namely by making use of the fact
that a congruence transformation of the plane into itself transforms every
end $\xi$ differing from $\Omega$ into the end
$\xi' = \dfrac{\alpha\xi + \beta}{\gamma\xi + \delta}$
(it transforms in case of $\gamma \ne 0$ the end $-\dfrac{\delta}{\gamma}$ into $\Omega$ and this latter into the end
$\dfrac{\alpha}{\gamma}$), where the coefficients $\alpha$, $\beta$, $\gamma$, $\delta$ depend only on the new system $\Omega'O'E'$
and
$\alpha\delta - \beta\gamma = \pm 1$
holds, according as a correspondence in the same or in the opposite sense is
involved [6]. ³
On the basis of this fact, we obtain that the new line coordinates expressed
in terms of the original ones are
(1) $u' = a_{11}u + a_{12}v + a_{13}w, \quad v' = a_{21}u + a_{22}v + a_{23}w, \quad w' = a_{31}u + a_{32}v + a_{33}w$,
where the coefficients $a_{ik}$ depend only on the new system, and among which
the relations
(2) $a_{11}^2 + a_{21}^2 - a_{31}^2 = 1, \quad a_{12}^2 + a_{22}^2 - a_{32}^2 = 1, \quad a_{13}^2 + a_{23}^2 - a_{33}^2 = -1$
and
(3) $a_{11}a_{12} + a_{21}a_{22} - a_{31}a_{32} = 0, \quad a_{12}a_{13} + a_{22}a_{23} - a_{32}a_{33} = 0, \quad a_{13}a_{11} + a_{23}a_{21} - a_{33}a_{31} = 0$
are valid; further, the discriminant of this transformation is
(4) $D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$.
From this follows by means of a simple consideration making use of
the inequality (5) of § 2, that the new coordinates of a certain point expressed
in terms of the original ones are
(5) $x_1' = \pm(a_{11}x_1 + a_{12}x_2 + a_{13}x_3), \quad x_2' = \pm(a_{21}x_1 + a_{22}x_2 + a_{23}x_3), \quad x_3' = \pm(a_{31}x_1 + a_{32}x_2 + a_{33}x_3)$
where the coefficients $a_{ik}$ are the same as those in (1) and the sign $+$ or $-$ is
valid, if the two coordinate systems have the same or the opposite sense,
respectively.
3 For the proof see Gerretsen [2], Szasz [11], [12].
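The coefficient relations can be checked mechanically for a concrete matrix. The sketch below assumes that relations (2) and (3) express the invariance of the form $u^2 + v^2 - w^2$ under the transformation, that is, that the columns of $(a_{ik})$ are orthonormal with respect to the signature $(1, 1, -1)$. The example matrix has the shape one would expect for a translation along $O\Omega$ in the real case; it is chosen for illustration and is not taken from the paper, and all function names are hypothetical.

```python
import math

J = (1, 1, -1)  # signature of the form u^2 + v^2 - w^2

def preserves_form(A, tol=1e-12):
    """Check the column relations: sum_i J[i]*A[i][j]*A[i][k] equals J[j] if j == k, else 0."""
    for j in range(3):
        for k in range(3):
            s = sum(J[i] * A[i][j] * A[i][k] for i in range(3))
            target = J[j] if j == k else 0
            if abs(s - target) > tol:
                return False
    return True

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def translation_matrix(s):
    """Illustrative example: hyperbolic rotation in the (x1, x3)-plane by the amount s."""
    c, sh = math.cosh(s), math.sinh(s)
    return [[c, 0.0, sh], [0.0, 1.0, 0.0], [sh, 0.0, c]]

A = translation_matrix(0.7)
print(preserves_form(A), det3(A))
```

For this example the determinant is $\cosh^2 s - \sinh^2 s$, so the printed value should be 1 up to rounding.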
5. Distance of two points. The geometrical significance of the expression
$u_1u_2 + v_1v_2 - w_1w_2$ for two lines. Distance of a point from a line.
Choosing the new coordinate-system suitably, it follows from the formulas
of the coordinate-transformation (§ 4, (5)), by means of the relations between
the coefficients (§ 4, (2), (3)), that for the distance $d$ of the points
$(x_1, x_2, x_3)$ and $(\bar{x}_1, \bar{x}_2, \bar{x}_3)$ in the original system holds
(1) $C(d) = x_3\bar{x}_3 - x_2\bar{x}_2 - x_1\bar{x}_1$.
This formula (1) discloses the simple geometrical significance of the
third coordinate $x_3$ at once. Namely by taking as second point $O$ the
coordinates of which are 0, 0, 1 (§ 2, (4)), formula (1) expresses that the
third coordinate $x_3$ of a point $P$, determined by the distance $d = OP$, is
(2) $x_3 = C(d)$.
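The distance formula lends itself to a quick numerical check. The sketch assumes the real case, in which $C$ is the hyperbolic cosine and $\mathfrak{E}(t) = e^{t}$, and builds points from mixed coordinates in the way described in § 2. The names `point` and `distance` are hypothetical.

```python
import math

def point(t, alpha):
    """Weierstrass coordinates from mixed coordinates, real case E = exp (assumption)."""
    k = 0.5 * alpha ** 2 * math.exp(-t)
    return math.sinh(t) + k, alpha * math.exp(-t), math.cosh(t) + k

def distance(p, q):
    """d with C(d) = p3*q3 - p2*q2 - p1*q1; C is cosh in the real case."""
    c = p[2] * q[2] - p[1] * q[1] - p[0] * q[0]
    return math.acosh(max(c, 1.0))  # clamp guards against rounding slightly below 1

O = (0.0, 0.0, 1.0)
print(distance(O, point(1.5, 0.0)))                 # point on the line O-Omega, t = 1.5
print(distance(point(1.0, 0.0), point(3.0, 0.0)))   # two points on the line O-Omega
p = point(0.4, 2.0)
print(math.isclose(math.cosh(distance(O, p)), p[2]))  # third coordinate equals C(d)
```

Points with $\alpha = 0$ lie on $O\Omega$, so their mutual distances are just differences of the $t$ values, which gives an easy sanity check.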
Similarly, from the formulas of the transformation of the line coordinates
(§ 4, (1), (2), (3)) by taking the coordinate-system suitably, follows
in succession that
(i) for the directed lines $s_1$, $s_2$ intersecting each other, one has
$u_1u_2 + v_1v_2 - w_1w_2 = T(a)$
where $a$ designates the distance with sign of the foot of the perpendicular
dropped from the end of $s_2$ falling in the positive direction, upon $s_1$ (Fig. 10).
Fig. 10
(ii) for lines $s_1$, $s_2$ possessing a common perpendicular and directed
equally one has
$u_1u_2 + v_1v_2 - w_1w_2 = C(a)$
where $a$ signifies the piece of the common perpendicular between $s_1$ and $s_2$
(Fig. 11).
(iii) for parallel lines directed equally (Fig. 12) one has
$u_1u_2 + v_1v_2 - w_1w_2 = 1$.
Fig. 11
Fig. 12
From these theorems and the behavior of the functions $C(t)$ and $T(t)$
follows that the lines $(u_1, v_1, w_1)$ and $(u_2, v_2, w_2)$ differing from each other
1) meet if and only if
$|u_1u_2 + v_1v_2 - w_1w_2| < 1$;
in particular they are perpendicular if, and only if,
$u_1u_2 + v_1v_2 - w_1w_2 = 0$;
2) have a common perpendicular if, and only if,
$|u_1u_2 + v_1v_2 - w_1w_2| > 1$;
3) are parallel if, and only if,
$|u_1u_2 + v_1v_2 - w_1w_2| = 1$.
Finally, it follows from a suitable choice of the new coordinate-system
of the same sense, and from the formulas with respect to line and point
coordinates together, that for the distance $t$ of the point $(x_1, x_2, x_3)$ from the
directed line $(u, v, w)$ one has
(3) $S(t) = ux_1 + vx_2 - wx_3$
where $t$ should be taken positive or negative, accordingly, as the point is on
the positive or negative side of the line.
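As a hedged illustration of the point-to-line formula, the sketch below evaluates $S(t) = ux_1 + vx_2 - wx_3$ in the real case, where $S$ is the hyperbolic sine. It uses the line coordinates $(-1, 0, 0)$ for $OE$ directed towards $E$ and $(0, 1, 0)$ for $O\Omega$ directed towards $\Omega$; the function name `signed_distance` and the sample points are assumptions made for the example.

```python
import math

def signed_distance(x, line):
    """t with S(t) = u*x1 + v*x2 - w*x3; S is sinh in the real case (assumption)."""
    (x1, x2, x3), (u, v, w) = x, line
    return math.asinh(u * x1 + v * x2 - w * x3)

line_OE = (-1.0, 0.0, 0.0)       # ends xi = 1, eta = -1, directed towards E
line_OOmega = (0.0, 1.0, 0.0)    # directed towards Omega

a, b = 0.7, -0.3
p = (math.sinh(a), 0.0, math.cosh(a))   # point on O-Omega with x1 = S(a)
q = (0.0, math.sinh(b), math.cosh(b))   # point on OE with x2 = S(b)
print(signed_distance(p, line_OE), signed_distance(p, line_OOmega))
print(signed_distance(q, line_OOmega))
```

With these sign conventions the first point comes out at signed distance $-a$ from $OE$ and $0$ from $O\Omega$, and the second at signed distance $b$ from $O\Omega$.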
Fig. 13
This theorem discloses the simple geometrical meaning of the first two
coordinates $x_1$, $x_2$. Namely, since the end $E$ in the endcalculus is $\xi = 1$
and the other end of the line $OE$ is $\eta = -1$, therefore the line coordinates
of this line are $-1$, 0, 0 (§ 3, (4)), thus for the signed distance $-a$ of the
point $(x_1, x_2, x_3)$ from the line $OE$ one has by formula (3), $-x_1 = S(-a)$,
or
(4) $x_1 = S(a)$.
Since moreover the line coordinates of the line $O\Omega$, directed towards $\Omega$,
are 0, 1, 0 (§ 3, (4*)), for the signed distance $b$ of the point $(x_1, x_2, x_3)$
from this latter line one has by (3),
(5) $x_2 = S(b)$.
Combining these results (4) and (5) with that of (2), we may state that
if the distance of a point from the line $OE$ is $a$, from $O\Omega$ is $b$, from the point $O$
is $d$, and we take the distance $a$ on the right side of $OE$ for positive (Fig. 13),
the distance $b$ over $O\Omega$ for positive as well, and both on the other side for
negative, then in the endcalculus with respect to the right angle $\Omega OE$ the
Weierstrass homogeneous coordinates of this point are
(6) $x_1 = S(a), \quad x_2 = S(b), \quad x_3 = C(d)$.
The methods of § 2-5 are those by means of which I have founded, on
the basis of the endcalculus of Hilbert, the analytic geometry of the
hyperbolic plane. In this way I have laid the foundation for a completely
elementary and at the same time independent construction of hyperbolic
plane geometry.
It is not difficult indeed, on the basis of the above exposition, to introduce
the homogeneous coordinates of points at infinity (viz. ends) and that
of ideal points, further to define the concept of the ideal line and that of
line at infinity analytically. The identity of hyperbolic plane geometry
with the well-known circle-model of Klein-Hilbert [5], [4, p. 38], emphatically
independent of continuity, is already a consequence of this analytic
geometry.
To conclude we may mention that, by a result in J. C. H. Gerretsen [1],
the axiom on the intersection of two circles can be derived from the
axioms of the hyperbolic plane referred to at the beginning of this
discussion. The analytic geometry of the hyperbolic plane outlined in
the present paper provides a new proof of this result (cf. Szasz [13]).
Bibliography
[1] GERRETSEN, J. C. H., Die Begründung der Trigonometrie in der hyperbolischen
Ebene. Koninklijke Nederlandsche Akademie van Wetenschappen, Proceedings
of the Section of Sciences, vol. 45 (1942), pp. 360-366, 479-483, 559-566.
[2] GERRETSEN, J. C. H., Zur hyperbolischen Geometrie. Koninklijke Nederlandsche Akademie
van Wetenschappen, Proceedings of the Section of Sciences, vol. 45 (1942), pp.
567-573.
[3] HILBERT, D., Neue Begründung der Bolyai-Lobatschefskyschen Geometrie.
Mathematische Annalen, vol. 57 (1903), pp. 137-150.
[4] HILBERT, D., Grundlagen der Geometrie. 7. Aufl., Leipzig and Berlin 1930, pp. 159-177.
[5] KLEIN, F., Über die sogenannte Nicht-Euklidische Geometrie. Mathematische
Annalen, vol. 4 (1871), pp. 583-625, spec. pp. 620-621 (reprinted in Gesammelte
Mathematische Abhandlungen I, Berlin 1921, pp. 254-305, spec. 300-301).
[6] LIEBMANN, H., Über die Begründung der hyperbolischen Geometrie. Mathematische
Annalen, vol. 59 (1904), pp. 110-128.
[7] SZASZ, PAUL, A Poincaré-féle félsík és a hiperbolikus síkgeometria kapcsolatáról
(in Hungarian). A Magyar Tudományos Akadémia III. Osztályának Közleményei,
vol. 6 (1956), pp. 163-184.
[8] SZASZ, PAUL, A remark on Hilbert's foundation of the hyperbolic plane geometry. Acta
Mathematica Academiae Scientiarum Hungaricae, vol. 9 (1958), pp. 29-31.
[9] SZASZ, PAUL, Begründung der analytischen Geometrie der hyperbolischen Ebene mit den
klassischen Hilfsmitteln, unabhängig von der Trigonometrie dieser Ebene. Acta
Mathematica Academiae Scientiarum Hungaricae, vol. 8 (1957), pp. 139-157.
[10] SZASZ, PAUL, Die hyperbolische Trigonometrie als Folge der analytischen Geometrie der
hyperbolischen Ebene. Acta Mathematica Academiae Scientiarum Hungaricae,
vol. 8 (1957), pp. 159-161.
[11] SZASZ, PAUL, Über die Hilbertsche Begründung der hyperbolischen Geometrie. Acta
Mathematica Academiae Scientiarum Hungaricae, vol. 4 (1954), pp. 243-250.
[12] SZASZ, PAUL, Unmittelbare Einführung Weierstrasscher homogenen Koordinaten in der
hyperbolischen Ebene auf Grund der Hilbertschen Endenrechnung, Anhang. Acta
Mathematica Academiae Scientiarum Hungaricae, vol. 9 (1958), pp. 1-28,
spec. 26-28.
[13] SZASZ, PAUL, New proof of the circle axiom for two circles in the hyperbolic plane by
means of the endcalculus of Hilbert. Annales Universitatis Scientiarum
Budapestinensis de Rolando Eötvös nominatae, vol. 1 (1958), pp. 97-100.
Symposium on the Axiomatic Method
AXIOMATIC DEVELOPMENT OF PLANE ABSOLUTE GEOMETRY
[AXIOMATISCHER AUFBAU DER EBENEN ABSOLUTEN GEOMETRIE]
FRIEDRICH BACHMANN
Mathematisches Seminar, Christian-Albrechts-Universität, Kiel, Germany
1. Absolute geometry is to be understood, in the sense of J. BOLYAI, as the
common foundation of the Euclidean and the non-Euclidean geometries. The
question of parallels, i.e. the question of whether lines intersect or do
not intersect, is left open.
The development of plane absolute geometry sketched here is of particular
interest because of its methodical use of reflections. Orderability and
free mobility are not required. The concept of absolute geometry is framed
so generally that models can be constructed over every field of
characteristic $\neq 2$ in which not every element is a square.
2. Let there first be given a set of points and a set of lines, together
with an incidence relation between points and lines and a perpendicularity
relation between lines, such that the following axioms hold:
INCIDENCE AXIOMS. There is at least one line, and at least three points are
incident with every line. For any two distinct points there is exactly one
line incident with both points.
ORTHOGONALITY AXIOMS. If a is perpendicular to b, then b is perpendicular to a.
Perpendicular lines have a point in common. Through every point there is a
perpendicular to every line, and if the point is incident with the line,
only one.
REFLECTION AXIOM (SCHÜTTE). For every line g there is at least one reflection
in g, i.e. an involutory, orthogonality-preserving collineation which leaves
all points of g fixed.
(A one-to-one mapping of a set onto itself is called involutory if it is
equal to its inverse mapping but different from the identity mapping.)
Products of line reflections are called motions.
From these axioms it follows that the lines a correspond one-to-one to the
reflections $\sigma_a$ in the lines, and the points A correspond one-to-one
to the point reflections $\sigma_A$, which are defined dually to the line
reflections. Furthermore:
A, b are incident if and only if $\sigma_A \sigma_b$ is involutory.
a, b are perpendicular if and only if $\sigma_a \sigma_b$ is involutory.
If one replaces the points and lines by the point reflections and line
reflections, and the given relations of incidence and perpendicularity by
the equivalent relations between the reflections, one therefore obtains an
isomorphic image of the given geometric structure inside the group of
motions. This makes it possible to formulate geometric theorems as
statements about reflections and to prove them by group-theoretic
calculation with reflections. The application of a line reflection
$\sigma_g$ to the points and lines corresponds, in the isomorphic image, to
the group-theoretic transformation (conjugation) of all point reflections
and line reflections by the line reflection $\sigma_g$.
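The two equivalences above can be tried out in a familiar special case. The following is a minimal sketch, assuming the ordinary Euclidean plane over the rationals as the model; the homogeneous 3x3 matrices and all function names are illustrative choices, not part of the paper.

```python
from fractions import Fraction as Fr

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def mul(m, n):
    """Product of two 3x3 matrices given as tuples of rows."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def line_reflection(line):
    """Reflection in the line a*x + b*y + c = 0 (assumed Euclidean model)."""
    a, b, c = line
    n = Fr(a * a + b * b)
    return ((1 - 2 * a * a / n, -2 * a * b / n, -2 * a * c / n),
            (-2 * a * b / n, 1 - 2 * b * b / n, -2 * b * c / n),
            (0, 0, 1))

def point_reflection(point):
    """Point reflection (x, y) -> (2*px - x, 2*py - y)."""
    px, py = point
    return ((-1, 0, 2 * px), (0, -1, 2 * py), (0, 0, 1))

def is_involutory(m):
    """Equal to its own inverse, but not the identity."""
    return m != I3 and mul(m, m) == I3

def incident(point, line):
    """A, b incident  <=>  sigma_A * sigma_b is involutory."""
    return is_involutory(mul(point_reflection(point), line_reflection(line)))

def perpendicular(line1, line2):
    """a, b perpendicular  <=>  sigma_a * sigma_b is involutory."""
    return is_involutory(mul(line_reflection(line1), line_reflection(line2)))

x_axis = (0, 1, 0)       # the line y = 0
vertical = (1, 0, -1)    # the line x = 1
diagonal = (1, -1, 0)    # the line y = x

print(incident((2, 0), x_axis), incident((2, 1), x_axis))
print(perpendicular(vertical, x_axis), perpendicular(diagonal, x_axis))
```

Exact rational arithmetic is used so that the involution test is an equality test rather than a floating-point comparison.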
3. By the theorem of the three reflections we mean the statement: The
product of the reflections in three lines a, b, c which are incident with a
point or perpendicular to a line is equal to the reflection in a line d.
A collection of points and lines for which the axioms stated above and the
theorem of the three reflections hold is called a metric plane, and the
theory of these metric planes is called plane absolute geometry.
4. For the development of plane absolute geometry we use, in accordance
with the considerations in 2, instead of the axioms stated so far an axiom
system which characterizes the groups of motions of metric planes.
We first introduce some group-theoretic notation. Let an arbitrary group be
given. If $\alpha$, $\gamma$ are group elements, we denote the element
$\gamma^{-1}\alpha\gamma$, which arises from $\alpha$ by transformation with
$\gamma$, by $\alpha^{\gamma}$. We have
$(\alpha\beta)^{\gamma} = \alpha^{\gamma}\beta^{\gamma}$ and
$\alpha^{\beta\gamma} = (\alpha^{\beta})^{\gamma}$. A set of group elements
is called invariant if it is closed under transformation with arbitrary
group elements.
Let $\rho$, $\sigma$ be involutory group elements. If they satisfy the
relation
(1) $\rho\sigma$ is involutory,
we write $\rho \mid \sigma$ for short. Obviously (1) is equivalent to
$\rho\sigma = \sigma\rho$ and $\rho \neq \sigma$. We write
$\rho_1, \ldots, \rho_m \mid \sigma_1, \ldots, \sigma_n$ as an abbreviation
for the conjunction of the statements $\rho_i \mid \sigma_k$
($i = 1, \ldots, m$; $k = 1, \ldots, n$).
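The stated equivalence of (1) with "the two elements commute and are distinct" holds in any group, and it can be checked by brute force in a small one. The sketch below assumes the symmetric group on four letters as a test group; this group is only a convenient stand-in and is not claimed to be a group of motions of a metric plane.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def is_involutory(p):
    """Equal to its own inverse, but not the identity."""
    identity = tuple(range(len(p)))
    return p != identity and compose(p, p) == identity

def stroke(rho, sigma):
    """The relation rho | sigma of (1): the product rho*sigma is involutory."""
    return is_involutory(compose(rho, sigma))

group = list(permutations(range(4)))
involutions = [p for p in group if is_involutory(p)]

# Compare (1) with "rho*sigma = sigma*rho and rho != sigma" for all pairs.
agree = all(
    stroke(r, s) == (compose(r, s) == compose(s, r) and r != s)
    for r in involutions for s in involutions
)
print(len(involutions), agree)
```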
5 (Group-theoretic axiom system of plane absolute geometry).
BASIC ASSUMPTION. Let there be given an invariant system of generators S,
consisting of involutory elements, of a group G.
The elements of S are denoted by lowercase Latin letters. The involutory
elements of G which can be represented as a product of two elements of S
are denoted by capital Latin letters (other than G, H, S).
AXIOM 1. For A, B there is always a c with $A, B \mid c$.
AXIOM 2. From $A, B \mid c, d$ it follows that A = B or c = d.
AXIOM 3. If $a, b, c \mid E$, then there is a d such that abc = d.
AXIOM 4. If $a, b, c \mid e$, then there is a d such that abc = d.
AXIOM 5. There exist a, b, c such that $a \mid b$ and neither $c \mid a$ nor
$c \mid b$ nor $c \mid ab$ holds.
This axiom system is a reduced version of an axiom system given by ARNOLD
SCHMIDT.
6 (Group plane). If $G_m$ is the group of motions of a metric plane and
$S_m$ the set of line reflections, then the pair $G_m$, $S_m$ satisfies the
group-theoretic axiom system.
Conversely, to every pair G, S which satisfies the group-theoretic axiom
system one can assign a metric plane by the following construction of the
group plane of G, S:
The elements a, b, ... (the elements of S) are called lines, and the
elements A, B, ... points of the group plane. Two lines a and b of the group
plane are called perpendicular to each other if $a \mid b$. (The points are
thus the group elements which can be represented as the product of two
perpendicular lines.) A point A and a line b of the group plane are called
incident if $A \mid b$. Axiom 1 states that any two points have a joining
line. Axiom 2 states that two distinct points have at most one joining
line. Axiom 5 expresses a minimal existence requirement and states that
there are two perpendicular lines a, b and a line c which is perpendicular
neither to a nor to b and is not incident with the point ab.
We define further: Three lines a, b, c of the group plane lie in a pencil if
(2) $abc \in S$
holds. If this is the case, so that there is a d with abc = d, we call d the
fourth reflection line of a, b, c. Axiom 3 and Axiom 4 state that three
lines which are incident with a point or perpendicular to a line lie in a
pencil.
The axiom system allows S to contain elements a, b, c for which abc = 1.
Then the lines a, b, c of the group plane are pairwise perpendicular. We say
that three such lines form a polar trilateral. (Polar trilaterals are known
to occur in elliptic planes.) If abc = 1, hence ab = c, then ab, being an
involutory product of two elements of S, is an element C; thus the same
group element is both a point and a line of the group plane. In general, if
C = c, we call the point C and the line c of the group plane polar to each
other. In this case every line incident with the point C is perpendicular to
the line c, and conversely; for if C = c, then for all x: $C \mid x$ implies
$c \mid x$, and conversely.
From the axioms it follows:
EXISTENCE OF THE PERPENDICULAR. For A, b there is always a c with
$A, b \mid c$, i.e. through every point there is a perpendicular to every
line.
UNIQUENESS OF THE PERPENDICULAR. From $A, b \mid c, d$ it follows that A = b
or c = d, i.e. if A, b are not polar to each other, there is only one
perpendicular to b through A. In particular, if A, b are incident, the
perpendicular to b erected at A is uniquely determined and equal to Ab.
The reflection of the group plane in a line c is the mapping
(3) $x^* = x^c$, $X^* = X^c$.
By Axioms 3 and 4, the theorem of the three reflections holds for the
reflections (3). The motions of the group plane are the mappings
(4) $x^* = x^{\gamma}$, $X^* = X^{\gamma}$ with $\gamma \in G$,
that is, the inner automorphisms of G applied to the set of lines and the
set of points.
The motions (4) of the group plane form a group $G^*$, which is generated by
the set $S^*$ of reflections (3) in the lines of the group plane. The center
of G consists only of the identity element. The pair $G^*$, $S^*$ is a
representation of the axiomatically given pair G, S.
7. Theorems of absolute geometry are now proved by group-theoretic
calculation with the involutory elements a, b, ... and A, B, ... There are
many simple proofs of this kind.
As an example we consider the theorem on isogonal correspondence with
respect to a trilateral a, b, c. It can be formulated as follows: If a', b',
c' are lines which lie in a pencil, and if b, a', c as well as c, b', a as
well as a, c', b lie in a pencil, then the fourth reflection lines
ba'c = a'', cb'a = b'', ac'b = c'' also lie in a pencil.
THEOREM ON ISOGONAL CORRESPONDENCE. From ba'c = a'', cb'a = b'', ac'b = c''
and $a'b'c' \in S$ it follows that $a''b''c'' \in S$.
PROOF. We have $a''b''c'' = ba'c \cdot cb'a \cdot ac'b = (a'b'c')^{b}$,
since cc = aa = 1. From $a'b'c' \in S$ it follows that $(a'b'c')^{b} \in S$,
by the invariance of S, and hence $a''b''c'' \in S$.
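The identity used in the proof is purely group-theoretic, so it can be checked mechanically in any group with an invariant set of involutions. The sketch below assumes, only for illustration, the symmetric group on four letters with S taken to be the set of transpositions. That pair is not claimed to satisfy Axioms 1 to 5, and the products ba'c, cb'a, ac'b are not required to lie in S here; only the identity and the invariance step are tested.

```python
from itertools import permutations, product

N = 4
IDENTITY = tuple(range(N))

def compose(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[p[i]] for i in range(N))

def inverse(p):
    inv = [0] * N
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def prod(*elems):
    """Product of several group elements, taken from left to right."""
    result = IDENTITY
    for e in elems:
        result = compose(result, e)
    return result

def conj(x, g):
    """x^g = g^{-1} x g, the transformation of x with g."""
    return prod(inverse(g), x, g)

# Stand-in for S: the transpositions, an invariant set of involutions.
S = [p for p in permutations(range(N)) if sum(p[i] != i for i in range(N)) == 2]
S_set = set(S)

identity_holds = True
closure_holds = True
for a, b, c, a1, b1, c1 in product(S, repeat=6):
    a2, b2, c2 = prod(b, a1, c), prod(c, b1, a), prod(a, c1, b)
    lhs = prod(a2, b2, c2)
    identity_holds &= lhs == conj(prod(a1, b1, c1), b)
    if prod(a1, b1, c1) in S_set:
        closure_holds &= lhs in S_set

print(identity_holds, closure_holds)
```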
Concerning the ternary relation (2), which defines what it means for lines
to lie in a pencil, we remark:
Because of the invariance of S, the relation (2) is reflexive and symmetric
in the following sense: For elements a, b, c which are not all distinct, (2)
always holds; if (2) holds for elements a, b, c, then it also holds for
every permutation of a, b, c. From the axiom system of absolute geometry it
follows that the relation (2) is also transitive, i.e. the
TRANSITIVITY THEOREM. From $a \neq b$ and $abc, abd \in S$ it follows that
$acd \in S$.
Useful for proofs in absolute geometry are lemmas about not necessarily
involutory elements of G, such as the following:
LEMMA OF THOMSEN. Let $\alpha$ and $\beta$ be elements of G which can be
represented as products of an odd number of elements of S. If
$\alpha \neq 1$ and $\alpha^{\beta} = \alpha^{-1}$, then $\alpha$ or $\beta$
lies in S.
LEMMA OF THE NINE INVOLUTORY PRODUCTS. If $\alpha_i, \beta_k \in G$
($i, k = 1, 2, 3$) and $\alpha_1 \neq \alpha_2$, $\beta_1 \neq \beta_2$, then
the following holds: If an element of S stands at each of the eight places
marked o in the product table of the $\alpha_i\beta_k$, then an element of S
also stands at the place marked *.

                $\beta_1$   $\beta_2$   $\beta_3$
    $\alpha_1$      o           o           o
    $\alpha_2$      o           o           o
    $\alpha_3$      o           o           *

From the lemma of THOMSEN one obtains by substitution, for example, the
altitude theorem; from the lemma of the nine involutory products one
obtains, for example, HESSENBERG's counterpairing theorem (quadrilateral
theorem), with which the theorem of PAPPUS can be obtained.
As an example, let us carry out the proof of the altitude theorem here. We
consider a trilateral which is not a polar trilateral and whose sides do not
lie in a pencil. By an altitude we mean a line which is perpendicular to one
side of the trilateral and lies in a pencil with the other two sides.
ALTITUDE THEOREM. If $abc \neq 1$ and $abc \notin S$, and if
(5) $u \mid a$, $v \mid b$, $w \mid c$,
(6) $bcu, cav, abw \in S$,
then $uvw \in S$.
PROOF. By the first hypothesis in (5) we have ua = au, hence $a^{u} = a$,
and by the first hypothesis in (6) bcu = ucb, hence $(bc)^{u} = cb$;
altogether $(abc)^{u} = a^{u}(bc)^{u} = acb$. Repeating the argument, one
obtains
$(abc)^{uvw} = (a^{u}(bc)^{u})^{vw} = (acb)^{vw} = ((ac)^{v}b^{v})^{w} = (cab)^{w} = c^{w}(ab)^{w} = cba$.
Hence
$(abc)^{uvw} = (abc)^{-1}$,
and from this the assertion follows by the lemma of THOMSEN, applied to
$\alpha = abc$ and $\beta = uvw$, since $abc \neq 1$ and $abc \notin S$.
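The chain of equalities in this proof can be replayed numerically. The following is a minimal sketch, assuming the Euclidean plane over the rationals as the model and one particular triangle; the matrices, the triangle and all function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction as Fr

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def mul(*ms):
    """Product of 3x3 matrices given as tuples of rows."""
    result = I3
    for m in ms:
        result = tuple(tuple(sum(result[i][k] * m[k][j] for k in range(3))
                             for j in range(3)) for i in range(3))
    return result

def reflection(line):
    """Reflection in the line a*x + b*y + c = 0 (assumed Euclidean model)."""
    a, b, c = line
    n = Fr(a * a + b * b)
    return ((1 - 2 * a * a / n, -2 * a * b / n, -2 * a * c / n),
            (-2 * a * b / n, 1 - 2 * b * b / n, -2 * b * c / n),
            (0, 0, 1))

def line_through(p, q):
    a, b = q[1] - p[1], p[0] - q[0]
    return (a, b, -(a * p[0] + b * p[1]))

def perpendicular_through(p, line):
    a, b, _ = line
    return (-b, a, b * p[0] - a * p[1])

def is_involutory(m):
    return m != I3 and mul(m, m) == I3

# Hypothetical triangle; a, b, c are the reflections in its sides.
A, B, C = (0, 0), (4, 0), (1, 3)
side_a, side_b, side_c = line_through(B, C), line_through(C, A), line_through(A, B)
a, b, c = reflection(side_a), reflection(side_b), reflection(side_c)
# u, v, w are the reflections in the altitudes through A, B, C.
u = reflection(perpendicular_through(A, side_a))
v = reflection(perpendicular_through(B, side_b))
w = reflection(perpendicular_through(C, side_c))

hyp5 = all(is_involutory(mul(x, y)) for x, y in ((u, a), (v, b), (w, c)))
hyp6 = all(is_involutory(m) for m in (mul(b, c, u), mul(c, a, v), mul(a, b, w)))
abc = mul(a, b, c)
# (abc)^{uvw} = (uvw)^{-1} abc (uvw), and (uvw)^{-1} = wvu for involutions.
conjugate = mul(w, v, u, abc, u, v, w)
print(hyp5, hyp6, conjugate == mul(c, b, a), is_involutory(mul(u, v, w)))
```

In this model the conclusion that uvw is a line reflection is the familiar statement that the three altitudes of a triangle pass through one point.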
8 (Pencils of lines). Since the ternary relation (2) is, as remarked in 7,
reflexive, symmetric and transitive, it defines subsets of S with the
following properties: 1) For any three elements a, b, c of a subset, (2)
holds; 2) If the relation (2) holds between two distinct elements a, b of a
subset and an element c, then c also belongs to the subset; 3) For any two
elements a, b there is a subset to which they belong. From 1), 2), 3) it
follows: Any two distinct subsets have at most one element in common.
These subsets of the set of all lines, defined by the relation (2), are
called pencils of lines. Any two distinct lines a, b determine a pencil of
lines; it consists of all lines c which lie in a pencil with a, b.
All lines incident with a given point A form a pencil of lines, which we
denote by G(A). Such pencils are called proper pencils of lines.
All lines perpendicular to a given line a form a pencil of lines, the
pencil of perpendiculars of a, which we denote by G(a).
9 (Half-rotations). A further tool for considerations in absolute geometry
is provided by certain mappings which are not motions, namely the
half-rotations introduced by HJELMSLEV.
Every element $\alpha$ of G which can be represented as a product of an odd
number of elements of S can be represented in the form
(7) abc with $a \mid b, c$.
If $\alpha \neq 1$, then $\alpha$ uniquely determines the element a
occurring in the representation (7), which we denote by $[\alpha]$.
Now let $\gamma$ be a non-involutory element of G which can be represented
as the product of two lines incident with a point O. The assignment
$x \to [x\gamma]$ defines a one-to-one mapping of the set of lines of the
group plane into itself. We call this mapping the half-rotation about O
belonging to the group element $\gamma$, and denote it by $H_{\gamma}$. Thus
(8) $xH_{\gamma} = [x\gamma]$.
The half-rotations preserve pencils: If three lines lie in a pencil, their
image lines also lie in a pencil, and conversely. In particular, every
half-rotation about O maps the set of lines through O one-to-one onto
itself. Perpendicular lines are in general not carried into perpendicular
lines, but they are whenever one of the two lines passes through O.
Every half-rotation induces a one-to-one mapping of the set of pencils of
lines into itself; the set of proper pencils of lines is thereby mapped
into itself. We also call this mapping of the pencils a half-rotation and
denote it by the same symbol as the half-rotation of the lines by which it
is induced. The set of pencils of perpendiculars of the lines through O is
mapped onto itself by every half-rotation about O; every other pencil of
lines can be carried into a proper pencil of lines by a suitable
half-rotation about O.
10. From certain theorems of absolute geometry, correct theorems of absolute
geometry arise again under certain replacements of points by lines or of
lines by points. An example of this "point-line analogy", to which ARNOLD
SCHMIDT has drawn attention, is given by Axiom 3 and Axiom 4; further
examples are:

    For A, B there is always a c          For A, b there is always a c
    with $A, B \mid c$ (existence         with $A, b \mid c$ (existence
    of the joining line).                 of the perpendicular).

    From $A, B \mid c, d$ it follows      From $A, b \mid c, d$ it follows
    that A = B or c = d (uniqueness       that A = b or c = d (uniqueness
    of the joining line).                 of the perpendicular).

If in the theorems on the right one also replaces the point A by a line a,
one obtains the statements
V   For a, b there is always a c with $a, b \mid c$,
i.e. any two lines have a common perpendicular, and
~R  From $a, b \mid c, d$ it follows that a = b or c = d,
i.e. two distinct lines have at most one common perpendicular.
The statement ~R is the negation of the statement
R   There exist a, b, c, d with $a, b \mid c, d$ and $a \neq b$ and $c \neq d$,
which states that a rectangle exists.
None of the statements V, ~R, R is provable from the axioms of absolute
geometry. Any one of them can be added as an additional axiom to the axiom
system of 5, thereby defining special cases of absolute geometry. The
statement V is equivalent to the existence of polar trilaterals and defines
elliptic geometry within the framework of our axiom system of absolute
geometry. We call the statement R the axiom of Euclidean metric and the
statement ~R the axiom of non-Euclidean metric. The additional axioms R and
~R lead to the bifurcation of absolute geometry into geometry with Euclidean
metric and geometry with non-Euclidean metric. From V follows ~R.
A general theorem describing the extent of the replacements of points by
lines and of lines by points permitted in absolute geometry is not known.
However, in the elliptic geometry defined by the additional axiom V,
arbitrary replacements of this kind are permitted.
11 (Projective-metric planes). By a projective plane we mean a set of points
and lines in which the projective incidence axioms, the theorem of PAPPUS
and the FANO axiom hold.
A projective plane in which a line is distinguished as the line "at
infinity" $g_{\infty}$, and on it a projective fixed-point-free involution
is distinguished as the "absolute" involution, is called a singular
projective-metric plane. To every line $a \neq g_{\infty}$ we assign a pole,
namely the point on $g_{\infty}$ which corresponds in the absolute
involution to the point of intersection of a and $g_{\infty}$.
A projective plane in which a projective polarity is distinguished as the
"absolute" polarity is called an ordinary projective-metric plane.
Now let c be a line of a given projective-metric plane; in the singular case
let c be different from $g_{\infty}$, and in the ordinary case let c not be
incident with its pole. Then the harmonic homology whose axis is the line c
and whose center is the pole of c is called the reflection of the
projective-metric plane in the line c. The group $G_{pm}$ generated by the
set $S_{pm}$ of these reflections in lines of the projective-metric plane is
called the group of motions of the projective-metric plane.
12 (Ideal plane). The group plane of G, S can be extended to a
projective-metric plane by introducing ideal elements.
For this purpose the pencils of lines are called ideal points, and the
proper pencils of lines proper ideal points. The set of all pencils of lines
which have a line a in common is called the proper ideal line g(a).
To define the concept of ideal line in general, we use the half-rotations,
which make it possible to carry "improper" objects into "proper" ones
(cf. 9).
We choose a point O of the group plane, which we keep fixed from now on. A
half-rotation $H_{\gamma}$ about O carries every proper ideal line into a
proper ideal line; for
(9) $g(a)H_{\gamma} = g(aH_{\gamma})$.
The set of pencils of perpendiculars of the lines through O, which is
carried into itself by every half-rotation about O, is denoted by g(O).
A set a of ideal points is now called an ideal line 1) if there is a
half-rotation $H_{\gamma}$ about O such that $aH_{\gamma}$ is a proper ideal
line, and also 2) if a = g(O).
One then proves that the ideal points and ideal lines form a projective
plane, the ideal plane of G, S. The proper ideal points and the proper ideal
lines form a subplane of the ideal plane which is isomorphic to the group
plane.
It must now be shown that the orthogonality defined in the group plane
induces projective-metric relations in the ideal plane.
We first assume that the axiom of Euclidean metric holds in the group plane.
Then any two lines which have a common perpendicular have the same
perpendiculars (are "lotgleich"), i.e. every perpendicular of the one line
is also a perpendicular of the other line. Hence every pencil of
perpendiculars is also the pencil of perpendiculars of a line through a
fixed chosen point. The set of all pencils of perpendiculars is therefore an
ideal line, which we denote by $g_{\infty}$.
If there is a line in one pencil of perpendiculars which is orthogonal to a
line of another pencil of perpendiculars, then every line of the one pencil
is orthogonal to every line of the other. There is therefore an
orthogonality of pencils of perpendiculars. It defines on the distinguished
ideal line $g_{\infty}$ a projective involution without fixed elements. The
ideal plane of a group plane with Euclidean metric is thus a singular
projective-metric plane.
Now let the axiom of non-Euclidean metric hold in the group plane. Then
every line has the same perpendiculars only as itself, and the pencils of
perpendiculars of distinct lines are distinct. If one assigns to every
proper ideal line g(a) the ideal point G(a) (the pencil of perpendiculars of
a) as its pole, this is now a one-to-one correspondence between the proper
ideal lines and the pencils of perpendiculars. To extend this correspondence
to a polarity defined on the whole ideal plane, we again use the
half-rotations about the point O. If one first applies a half-rotation
$H_{\gamma}$ about O to a proper ideal line g(a), one obtains by (9) the
proper ideal line $g(aH_{\gamma})$. The pole G(a) of g(a) will in general
not be carried into the pole $G(aH_{\gamma})$ of $g(aH_{\gamma})$. Rather,
the following general connection holds between G(a) and $G(aH_{\gamma})$:
(10) $G(a)H_{\gamma^{-1}}^{-1} = G(aH_{\gamma})$.
We call every pair g(a), G(a) a primitive polar-pole pair and now define,
for an ideal line a and an ideal point A:
a, A are called a polar-pole pair 1) if there is a half-rotation
$H_{\gamma}$ about O such that $aH_{\gamma}$, $AH_{\gamma^{-1}}^{-1}$ are a
primitive polar-pole pair, and also 2) if a = g(O), A = G(O).
One now proves that this defines a projective polarity in the ideal plane.
The ideal plane of a group plane with non-Euclidean metric is thus an
ordinary projective-metric plane.
13. The reflection (3) of the group plane in a line c induces the reflection
of the projective-metric ideal plane in the proper ideal line g(c). The
motions (4) of the group plane therefore induce motions of the
projective-metric ideal plane. This yields the
MAIN THEOREM. Every pair G, S which satisfies the group-theoretic axiom
system of 5 can be represented as a subsystem of a pair $G_{pm}$, $S_{pm}$.
In other words: The groups of motions of metric planes can be represented as
subgroups of groups of motions of projective-metric planes.
14 (Metric vector spaces and orthogonal groups). Let $V_3(K, F)$ be the
three-dimensional vector space over a field K of characteristic $\neq 2$,
metrized by a symmetric bilinear form F. If in the metric vector space
$V_3(K, F)$ all isotropic vectors lie in the radical, the form F is called
nullteilig.
The proper orthogonal group $O_3^{+}(K, F)$ is defined as the group of all
linear mappings of the metric vector space $V_3(K, F)$ onto itself which
preserve the value of F and have determinant 1. By the reflection of the
metric vector space in a non-isotropic one-dimensional subspace T we mean
the involutory linear mapping of the metric vector space onto itself which
fixes every vector of the subspace T and carries every vector of the
orthogonal complement of T into its negative. The set $S_3^{+}(K, F)$ of all
these reflections of the metric vector space is a system of generators of
the group $O_3^{+}(K, F)$.
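The properties required of such a reflection can be checked on a concrete example. The sketch below assumes the rational field, the diagonal form F = diag(1, 1, -1) and the vector t = (1, 2, 1) spanning T; the explicit formula x -> 2 F(x, t)/F(t, t) t - x is the standard way to write a map that fixes t and negates its orthogonal complement, and it is an illustrative choice rather than a formula quoted from the paper.

```python
from fractions import Fraction as Fr

BASIS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def form(F, x, y):
    """Value F(x, y) of the symmetric bilinear form with Gram matrix F."""
    return sum(F[i][j] * x[i] * y[j] for i in range(3) for j in range(3))

def reflection(F, t):
    """Matrix of x -> 2 F(x,t)/F(t,t) t - x for a non-isotropic vector t."""
    ftt = Fr(form(F, t, t))
    if ftt == 0:
        raise ValueError("t is isotropic; no reflection is defined")
    images = [tuple(2 * form(F, e, t) / ftt * t[i] - e[i] for i in range(3))
              for e in BASIS]
    # images[j] is the image of the j-th basis vector; store it as column j.
    return tuple(tuple(images[j][i] for j in range(3)) for i in range(3))

def apply(m, x):
    return tuple(sum(m[i][j] * x[j] for j in range(3)) for i in range(3))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

F = ((1, 0, 0), (0, 1, 0), (0, 0, -1))   # assumed example form
t = (1, 2, 1)                            # F(t, t) = 4, so t is non-isotropic
s = reflection(F, t)

fixes_t = apply(s, t) == t
x = (2, -1, 0)                           # F(x, t) = 0: x is orthogonal to t
negates_complement = apply(s, x) == (-2, 1, 0)
preserves_form = all(form(F, apply(s, p), apply(s, q)) == form(F, p, q)
                     for p in BASIS for q in BASIS)
involutory = s != BASIS and all(apply(s, apply(s, e)) == e for e in BASIS)
print(fixes_t, negates_complement, preserves_form, involutory, det3(s))
```

The determinant equals 1 because the map has one eigenvalue +1 (on T) and a two-dimensional eigenspace for -1, which is why these reflections lie in the proper orthogonal group.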
15. Every projective plane can be represented as a three-dimensional vector
space over a field K of characteristic $\neq 2$, by representing the lines
by the one-dimensional and the points by the two-dimensional subspaces of
the vector space. Every projective-metric plane can be represented
correspondingly as a metric vector space $V_3(K, F)$; the form F is of rank
2 and nullteilig in the singular case, and of rank 3 in the ordinary case.
The reflections of the projective-metric plane in the lines mentioned in 11
can be represented by the reflections of the metric vector space in the
non-isotropic one-dimensional subspaces. For the group of motions of the
projective-metric plane we therefore have: The pair $G_{pm}$, $S_{pm}$ can
be represented by the pair
(11) $O_3^{+}(K, F)$, $S_3^{+}(K, F)$.
The main theorem therefore makes it possible to represent the groups which
satisfy the axiom system of absolute geometry (in other words, the groups of
motions of the axiomatically given metric planes) as groups of orthogonal
transformations of metric vector spaces:
MAIN THEOREM, algebraic version. Every pair G, S which satisfies the axiom
system of 5 can be represented as a subsystem of a pair (11), where the
field K is of characteristic $\neq 2$ and the symmetric bilinear form F is
of rank 2 and nullteilig, or of rank 3.
16. Conversely, the question now arises which subsystems of such pairs (11)
are models of the axiom system of 5. On this we remark the following:
CASE 1: F of rank 2 and nullteilig (Euclidean metric). In this case every
pair (11) satisfies the axiom system. If there are two orthogonal unit
vectors in the vector space metrized by F, then in every pair (11) all
"associated" subsystems which satisfy the axiom system can be described
algebraically. The subring of K generated by the elements $(1 + c^2)^{-1}$
with $c \in K$ plays a role here.
CASE 2: F of rank 3 and nullteilig (elliptic metric). In this case too every
pair (11) satisfies the axiom system. Examples of proper subsystems which
satisfy the axiom system are known; a general characterization appears to be
more difficult than in Case 1.
CASE 3: F of rank 3 and not nullteilig (hyperbolic metric). In this case no
pair (11) satisfies the axiom system. However, there may be a proper
subsystem S of $S_3^{+}(K, F)$ such that the pair $O_3^{+}(K, F)$, S
satisfies the axiom system. If F is normalized so that the determinant of F
lies in the square class of K to which 1 belongs, then the following holds:
If K is ordered and S is the set of elements of $S_3^{+}(K, F)$ with
negative norm, then $O_3^{+}(K, F)$, S satisfies the axiom system. All
invariant subsystems of the pairs (11) which satisfy the axiom system are
known, as are examples of non-invariant ones.
Bibliography
BOLYAI, J., Appendix. Scientiam spatii absolute veram exhibens: a veritate aut
falsitate Axiomatis XI Euclidei (a priori haud unquam decidenda) independentem:
adjecta ad casum falsitatis, quadratura circuli geometrica. Maros-Vásárhely 1832.
WIENER, H., Die Zusammensetzung zweier endlicher Schraubungen zu einer einzigen.
Zur Theorie der Umwendungen. Über geometrische Analysen. Über geometrische
Analysen, Fortsetzung. Über die aus zwei Spiegelungen zusammengesetzten
Verwandtschaften. Über Gruppen vertauschbarer zweispiegeliger Verwandtschaften.
Berichte über die Verhandlungen der Kgl. Sächsischen Gesellschaft der
Wissenschaften zu Leipzig. Mathematisch-naturwissenschaftliche Klasse. Band
42 (1890), S. 13-23, 71-87, 245-267; Band 43 (1891), S. 424-447, 644-673; Band
45 (1893), S. 555-598.
DEHN, M., Die Legendreschen Sätze über die Winkelsumme im Dreieck.
Mathematische Annalen. Band 53 (1900), S. 404-439.
HESSENBERG, G., Neue Begründung der Sphärik. Sitzungsberichte der Berliner
Mathematischen Gesellschaft. Band 4 (1905), S. 69-77.
HJELMSLEV, J., Neue Begründung der ebenen Geometrie. Mathematische Annalen.
Band 64 (1907), S. 449-474.
SCHUR, F., Grundlagen der Geometrie. Leipzig 1909. X+192 S.
HJELMSLEV, J., Einleitung in die allgemeine Kongruenzlehre. Det Kgl. Danske
Videnskabernes Selskab, Matematisk-fysiske Meddelelser. Band 8 (1929), Nr. 11.
Band 10 (1929), Nr. 1. Band 19 (1942), Nr. 12. Band 22 (1945), Nr. 6. Band 22
(1945), Nr. 13. Band 25 (1949), Nr. 10.
HESSENBERG, G., Grundlagen der Geometrie. Berlin/Leipzig 1930. 143 S.
THOMSEN, G., Grundlagen der Elementargeometrie in gruppenalgebraischer
Behandlung. Hamburger Mathematische Einzelschriften. Heft 15. Leipzig/Berlin
1933. 88 S.
REIDEMEISTER, K., Geometria proiettiva non euclidea. Rendiconti del Seminario
Matematico della R. Università di Roma. Serie III, volume 1, parte 2
(1934), p. 219-228.
BACHMANN, F., Eine Begründung der absoluten Geometrie in der Ebene.
Mathematische Annalen. Band 113 (1936), S. 424-451.
SCHMIDT, ARNOLD, Die Dualität von Inzidenz und Senkrechtstehen in der absoluten
Geometrie. Mathematische Annalen. Band 118 (1943), S. 609-635.
SPERNER, E., Ein gruppentheoretischer Beweis des Satzes von Desargues in der
absoluten Axiomatik. Archiv der Mathematik. Band 5 (1954), S. 458-468.
SCHÜTTE, K., Die Winkelmetrik in der affin-orthogonalen Ebene. Mathematische
Annalen. Band 130 (1955), S. 183-195.
SCHÜTTE, K., Gruppentheoretisches Axiomensystem einer verallgemeinerten
euklidischen Geometrie. Mathematische Annalen. Band 132 (1956), S. 43-62.
BACHMANN, F., Aufbau der Geometrie aus dem Spiegelungsbegriff. Die Grundlehren
der mathematischen Wissenschaften. Band 96. Berlin/Göttingen/Heidelberg
1959. XIV+312 S.
[The book cited last carries out in full the axiomatic development of plane
absolute geometry sketched here.]
Symposium on the Axiomatic Method
NEW METRIC POSTULATES FOR ELLIPTIC n-SPACE
LEONARD M. BLUMENTHAL
University of Missouri, Columbia, Missouri, U.S.A.
1. Introduction. In its most general aspects, a distance space is formed
from an abstract set S by mapping the set of all ordered pairs of elements
of S into a second set, which may be a subset of S. It is suggestive to call
the elements of S points, and the elements of the second set distances.
Distance spaces are particularized by specifying the distance sets and by
postulating properties of the mapping. If, for example, the distance set is
the class of non-negative real numbers, and the mapping that associates
with each pair p, q of elements of the set S the number pq is definite (that
is, pq = 0 if and only if p = q), and symmetric (pq = qp), the resulting
distance space is called semimetric. The class of metric spaces is obtained
by assuming, in addition, that if p, q, r ∈ S, the associated distances pq,
qr, pr satisfy the triangle inequality, pq + qr ≥ pr. For each positive
integer n, the classical spaces (euclidean, spherical, hyperbolic, and
elliptic) of n dimensions are metric spaces.
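The definitions of semimetric and metric can be phrased as finite checks when the point set is finite. The following is a small sketch under that assumption; the function names and the two sample distance functions are illustrative and do not come from the paper.

```python
from itertools import product

def is_semimetric(points, dist):
    """Non-negative, definite (pq = 0 iff p = q) and symmetric (pq = qp)."""
    return all(
        dist(p, q) >= 0
        and (dist(p, q) == 0) == (p == q)
        and dist(p, q) == dist(q, p)
        for p, q in product(points, repeat=2)
    )

def is_metric(points, dist):
    """Semimetric that also satisfies pq + qr >= pr for all triples."""
    return is_semimetric(points, dist) and all(
        dist(p, q) + dist(q, r) >= dist(p, r)
        for p, q, r in product(points, repeat=3)
    )

points = [0, 1, 2]

def line_distance(p, q):
    """Ordinary distance on the number line."""
    return abs(p - q)

# Hypothetical distance table: symmetric and definite, but 01 + 12 < 02.
TABLE = {frozenset((0, 1)): 1, frozenset((1, 2)): 1, frozenset((0, 2)): 3}

def table_distance(p, q):
    return 0 if p == q else TABLE[frozenset((p, q))]

print(is_metric(points, line_distance))
print(is_semimetric(points, table_distance), is_metric(points, table_distance))
```

The second example shows that the triangle inequality is a genuinely additional assumption: the table defines a semimetric space that is not metric.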
A given distance space Σ is characterized metrically with respect to a
prescribed class of distance spaces when necessary and sufficient con-
ditions, expressed wholly and explicitly in terms of the distance, are formu-
lated in order that any member of the class may be mapped onto Σ in a
distance-preserving manner. A mapping of this kind is called a congruence.
It is clear that such a metric characterization induces an axiomatization
of Σ in terms of the sole (geometric) primitive notions of point and distance
when the given class of comparison spaces is sufficiently general.
Euclidean spaces Rn were the first to be studied in this manner. In his
Zweite Untersuchung, Menger obtained metric postulates for euclidean
n-space by first solving the more general problem of characterizing
metrically subsets of Rn, with respect to the class of semimetric spaces
[6]. With this accomplished, the solution of the space problem follows
upon adjoining to the metric characterization of its subsets (with respect
to the class of semimetric spaces) those metric properties that serve to
distinguish the Rn itself among its subsets. It was noted by W. A. Wilson,
however, that though none of Menger's conditions for congruently
imbedding an arbitrary semimetric space into the Rn can be suppressed,
the set of assumptions obtained by adjoining to those conditions the
properties that individualize the Rn among the subsets (needed to
characterize the whole Rn) can be very materially reduced [8]. Wilson's
reduction consists in replacing Menger's assumption that for every integer
k, (1 < k < n), each (k + 1)-tuple of points of a semimetric space can be
congruently imbedded in Rn, by the much milder requirement that each
four points be imbeddable in R3. The crucial imbedding sets are thus
quadruples of points, regardless of the dimension of the euclidean space
being characterized.
The following comments concerning Wilson's contribution are pertinent.
(1). In validating the sufficiency of his "four-point" property,
Wilson made use of Menger's imbedding theorems for (k + 1)-tuples,
(1 < k < n). A simpler argument by the writer, using a weaker four-point
property, is quite independent of those results, and so solved the
space problem without any reference to the subset problem [1, pp. 123-128].
(2). The four-point property of Wilson suggests numerous weaker
properties which have been investigated by the writer and others [2].
This paper is concerned with an investigation of weak four-point properties
that arise in the metric study of elliptic spaces.
2. First metric axiomatization of elliptic space. Metric postulates for
spherical and hyperbolic spaces, arising from their metric characterizations
with respect to the class of semimetric spaces, were established by
the writer in 1935 and 1937, respectively.¹ But the numerous metric
abnormalities of elliptic space rendered its investigation (in the purely
metric manner imposed by the program) a more difficult matter, and it
was not until 1946 that the first set of metric postulates for finite and
infinite dimensional elliptic spaces was obtained [3]. Chief among the
metric features of elliptic space that make inapplicable the methods used
in the metric characterizations of other classical spaces are the following.
(1) Distinction between congruence and superposability. Defining a
motion as a congruent mapping of a space onto itself, two subsets are
called superposable provided there is a motion that maps one onto the
other. In contrast to the other classical spaces, two subsets of elliptic
space may be congruent without being superposable.
(2) Distinction between "contained in" and "congruently contained in".
In any of the classical spaces other than the elliptic, a subset that is
congruent with a subset of a subspace is actually contained in a subspace
of the same dimension. This is not the case in elliptic space.
¹ See [1].
(3) Dependence not a congruence invariant. An m-tuple of a space is
usually called dependent when it is contained in an (m − 2)-dimensional
subspace. With this convention, a dependent m-tuple of elliptic space may
be congruent with one that is not dependent.
(4) Non-linearity of the equidistant locus. The locus of points of the
elliptic plane that are equidistant from two distinct points consists of two
mutually perpendicular elliptic lines, and hence no subset contained in
two such lines forms a metric basis.
(5) Cardinality of the maximal equilateral set. The elliptic plane contains
six points with all fifteen distances equal. No equilateral septuple
exists in the plane or in elliptic three-space. The cardinality of the maximal
equilateral subset of elliptic n-space is not known for n > 3.
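The equilateral sextuple mentioned in (5) can be exhibited numerically. The sketch below assumes the standard model of the elliptic plane (points are lines through the origin of euclidean three-space, and the distance is r times the acute angle between two lines) and takes the six diagonals of a regular icosahedron; the names `DIAGONALS` and `elliptic_distance` are illustrative only.

```python
import math
from itertools import combinations

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

# One vertex from each antipodal pair of the icosahedron with vertices
# (0, ±1, ±PHI), (±1, ±PHI, 0), (±PHI, 0, ±1); each spans one line.
DIAGONALS = [
    (0, 1, PHI), (0, 1, -PHI),
    (1, PHI, 0), (1, -PHI, 0),
    (PHI, 0, 1), (-PHI, 0, 1),
]

def elliptic_distance(u, v, r=1.0):
    """r times the acute angle between the lines spanned by u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return r * math.acos(min(1.0, abs(dot) / norm))

distances = [elliptic_distance(u, v) for u, v in combinations(DIAGONALS, 2)]
```

All fifteen entries of `distances` come out as arccos(1/√5) times r, so the six lines form an equilateral set in this model.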
The following set of metric postulates for elliptic space (with positive
space constant r) was established in [3]. Let Σ_r denote a distance space
containing at least two points.
POSTULATE I. Σ_r is semimetric.
POSTULATE II. Σ_r is metrically convex (that is, if a, c ∈ Σ_r, a ≠ c,
Σ_r contains a point b such that a ≠ b ≠ c and ab + bc = ac).
The point b is said to be between a and c, and the relation is symbolized
by writing abc.
POSTULATE III. The diameter of Σ_r is at most πr/2.
POSTULATE IV. Σ_r is metrically complete.
POSTULATE V. If p, q ∈ Σ_r, pq ≠ πr/2, then Σ_r contains points p*, q*
such that pqp*, qpq* subsist, and pp* = qq* = πr/2.
Two points with distance πr/2 are called diametral. If p ∈ Σ_r, p* or
d(p) will denote a diametral point of p; that is, pp* = pd(p) = πr/2.
DEFINITION. Three points of Σ_r (not necessarily pairwise distinct) are
LINEAR provided the sum of two of the three distances they determine equals
the third.
If p1, p2, p3 ∈ Σ_r let Δ* denote the determinant |ε_ij cos(p_i p_j/r)|,
(i, j = 1, 2, 3), where every ε_ij is 1, except that ε_23 = ε_32 = −1.
A symmetric matrix (ε_ij), ε_ij = ε_ji = ±1, ε_ii = 1, (i, j = 1, 2, ..., m)
is called an EPSILON MATRIX.
POSTULATE VI. Let p0, p1, ..., p4 be any five pairwise distinct points
of Σ_r with (i) two triples linear, and (ii) the determinant Δ* of three of the
points (one of which is common to the two linear triples) negative. Then an
epsilon matrix (ε_ij), (i, j = 0, 1, ..., 4) exists such that all principal minors
of the determinant |ε_ij cos(p_i p_j/r)|, (i, j = 0, 1, ..., 4) are non-negative.
These postulates insure that all subspaces of Σ_r (properly defined) of
finite or infinite dimensions are elliptic (that is, congruent with the
classical elliptic spaces with space constant r).
To axiomatize elliptic n-space E_{n,r}, for a given positive integer n, it
suffices to adjoin the following (local) postulate.
POSTULATE VII. The integer n is the smallest for which a point q0 of Σ_r
and a spherical neighborhood U(q0) exist such that each n + 2 points p0, p1,
..., p_{n+1} of U(q0) have the property that if there is an epsilon matrix (ε_ij)
such that no principal minor of the determinant |ε_ij cos(p_i p_j/r)|, (i, j = 0, 1,
..., n + 1) is negative, then an epsilon matrix (ε_ij) exists such that no
principal minor of |ε_ij cos(p_i p_j/r)|, (i, j = 0, 1, ..., n + 1) is negative, and
the determinant vanishes.
Interpreted geometrically, Postulate VI asserts that each quintuple (of
a prescribed subclass of the class of all those quintuples of Σ_r containing
two linear triples) is congruently imbeddable in an elliptic space with
space constant r. The condition Δ* < 0 means that the perimeter of the
three points for which it is formed is less than πr and imparts a local
nature to the postulate. It is observed, moreover, that the specific
(elliptic) character of the space defined by Postulates I-VI is determined
by Postulate VI alone. In view of the discussion above of four-point
properties, it is natural to seek to replace the five-point property
expressed in Postulate VI by simpler four-point properties. The suggestion
to do so, made in the concluding section of [3], was acted upon in the
(unpublished) Missouri doctoral dissertation of J. D. Hankins (supervised
by the writer) which provides the basis for the present contribution [5].
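The relation between the sign of Δ* and the perimeter can be checked numerically. The sketch below expands the 3 × 3 determinant with ε_23 = ε_32 = −1 by hand; the function name `delta_star` and the sample distances are illustrative assumptions, and the triples are taken to be metric with all distances at most πr/2.

```python
import math

def delta_star(d12, d13, d23, r=1.0):
    """Determinant of [[1, a, b], [a, 1, -c], [b, -c, 1]] with
    a = cos(d12/r), b = cos(d13/r), c = cos(d23/r)."""
    a, b, c = (math.cos(d / r) for d in (d12, d13, d23))
    return 1 - a * a - b * b - c * c - 2 * a * b * c

# Perimeter 1.8 < pi (with r = 1): the determinant is negative.
small = delta_star(0.5, 0.6, 0.7)
# Perimeter 3.6 > pi (with r = 1): the determinant is positive.
large = delta_star(1.2, 1.2, 1.2)
```

For non-degenerate metric triples of this kind the sign of `delta_star` changes exactly where the perimeter passes πr, which is what gives Postulate VI its local character.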
3. Classes of quadruples and corresponding four-point properties. The
following seven classes of semimetric quadruples of pairwise distinct
points play a role in what follows.
A semimetric quadruple p1, p2, p3, p4 belongs to class
{Q1} if and only if it contains a linear triple,
{Q2} if and only if p2p3p4 subsists and p2p3 = p3p4,
{Q3} if and only if p2p3p4 subsists, p2p3 = p3p4, and the perimeter of
every three of the four points is less than πr + ε, where r and ε are
arbitrarily chosen positive constants,
{Q4} if and only if p2p3p4 subsists and p1p2 = p1p4,
{Q5} if and only if p2p3p4 subsists, p2p3 = p3p4, and p1p2 = p1p4,
{Q6} if and only if p2p3p4 subsists, p2p3 = 2p3p4, and p1p2 = p1p3,
{Q7} if and only if the quadruple contains two linear triples.
Clearly {Q_i} ⊂ {Q1}, (i = 2, 3, ..., 7) and {Q3} ⊂ {Q2}.
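Membership in the classes built from linear triples can be decided from the six distances alone. The following sketch covers only {Q1}, {Q2} and {Q7}; the helper names, the tolerance, and the convention that indices 0, 1, 2, 3 stand for p1, p2, p3, p4 are assumptions of the example.

```python
from itertools import combinations

def is_linear(d, i, j, k, tol=1e-9):
    """A triple is linear when two of its distances sum to the third."""
    a, b, c = sorted((d[i][j], d[j][k], d[i][k]))
    return abs(a + b - c) <= tol

def linear_triples(d):
    return [t for t in combinations(range(4), 3) if is_linear(d, *t)]

def in_q1(d):
    return len(linear_triples(d)) >= 1   # contains a linear triple

def in_q7(d):
    return len(linear_triples(d)) >= 2   # contains two linear triples

def in_q2(d, tol=1e-9):
    # p2p3p4 subsists (p3 between p2 and p4) and p2p3 = p3p4
    between = abs(d[1][2] + d[2][3] - d[1][3]) <= tol
    return between and abs(d[1][2] - d[2][3]) <= tol

# p2, p3, p4 equally spaced on a segment, p1 at distance 1.5 from each.
apex = [[0, 1.5, 1.5, 1.5],
        [1.5, 0, 1, 2],
        [1.5, 1, 0, 1],
        [1.5, 2, 1, 0]]
# Four equally spaced points of a segment: every triple is linear.
chain = [[abs(i - j) for j in range(4)] for i in range(4)]
```

The quadruple `apex` lies in {Q1} and {Q2} but not in {Q7}, while `chain` lies in all three classes.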
DEFINITION. A semimetric space has the elliptic WEAK, FEEBLE, ε-FEEBLE,
ISOSCELES WEAK, ISOSCELES FEEBLE, EXTERNAL ISOSCELES
FEEBLE four-point property if every quadruple of its points of class {Q1},
{Q2}, ..., {Q6}, respectively, is congruently imbeddable in an elliptic space
with space constant r. The space has the ELLIPTIC STRONG TWO-TRIPLE
PROPERTY if each of its quadruples of class {Q7} is congruently imbeddable in
an elliptic line.
The writer has established elsewhere the following imbedding theorem.²
THEOREM 3.1. A semimetric m-tuple p1, p2, ..., pm is congruently
imbeddable in elliptic n-space E_{n,r} if and only if (i) p_i p_j ≤ πr/2, (i, j = 1, 2,
..., m), and (ii) there exists an epsilon matrix (ε_ij), (i, j = 1, 2, ..., m),
such that the determinant |ε_ij cos(p_i p_j/r)|, (i, j = 1, 2, ..., m), has rank not
exceeding n + 1, with all non-vanishing principal minors positive.
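For small m the criterion of Theorem 3.1 can be tested by brute force over all epsilon matrices. The sketch below is an illustration rather than an efficient procedure: it reads condition (i) as p_i p_j ≤ πr/2, treats minors of absolute value below a tolerance as vanishing, and uses the fact that for a symmetric matrix with no negative principal minor the rank is the largest order of a non-vanishing principal minor. All names are hypothetical.

```python
import math
from itertools import combinations, product

def det(m):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def epsilon_matrices(m):
    """All symmetric ±1 matrices of order m with unit diagonal."""
    pairs = list(combinations(range(m), 2))
    for signs in product((1, -1), repeat=len(pairs)):
        eps = [[1] * m for _ in range(m)]
        for (i, j), s in zip(pairs, signs):
            eps[i][j] = eps[j][i] = s
        yield eps

def imbeddable(dist, n, r=1.0, tol=1e-9):
    """Brute-force test of conditions (i) and (ii) for a distance table."""
    m = len(dist)
    if any(dist[i][j] > math.pi * r / 2 + tol
           for i in range(m) for j in range(m)):
        return False
    for eps in epsilon_matrices(m):
        g = [[eps[i][j] * math.cos(dist[i][j] / r) for j in range(m)]
             for i in range(m)]
        minors = [(k, det([[g[a][b] for b in idx] for a in idx]))
                  for k in range(1, m + 1)
                  for idx in combinations(range(m), k)]
        if any(v < -tol for _, v in minors):
            continue                      # a negative principal minor
        rank = max((k for k, v in minors if v > tol), default=0)
        if rank <= n + 1:
            return True
    return False

# Three points of an elliptic line (r = 1) at parameters 0, 0.4, 1.0.
on_line = [[0, 0.4, 1.0], [0.4, 0, 0.6], [1.0, 0.6, 0]]
# Four points with all six distances equal to 1.5 (< pi/2).
equilateral = [[0 if i == j else 1.5 for j in range(4)] for i in range(4)]
```

With these tables, `imbeddable(on_line, 1)` holds, while `equilateral` is imbeddable for n = 3 but not for n = 2, since every one of its epsilon determinants is strictly diagonally dominant and hence of rank 4.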
With the aid of this theorem (a) conditions for the congruent imbedding
in elliptic space of each quadruple of the classes {Q1}, {Q2}, ..., {Q7} are
expressed in terms of the six distances determined by the quadruple, and
(b) if a quadruple of class {Q_i}, (i = 1, 2, ..., 6), is congruently
imbeddable in E_{n,r}, then it is congruently imbeddable in E_{2,r}.
DEFINITION. A Σ_r space is any space for which Postulates I-V are
valid.
The following sections investigate Σ_r spaces that have one or more of
the four-point properties defined above.
4. Spaces Σ_r with the elliptic weak four-point property. Let Σ_r(w)
denote a Σ_r space with the elliptic weak four-point property. It is proved
in [3] that the weak four-point property is possessed by spaces in which
Postulates I-VI are valid; that is, in the presence of Postulates I-V,
Postulate VI implies the weak four-point property. This section is
devoted to showing that in the same environment, the weak four-point
property implies Postulate VI.
² See [1], p. 208.
The following three theorems were either established in [3] by using the
weak four-point property (instead of Postulate VI) or their proofs are
immediate.
THEOREM 4.1. Each Σ_r(w) space is metric and every triple of points is
congruently imbeddable in E_{2,r}.
THEOREM 4.2. Two distinct non-diametral points p, q of a Σ_r(w) space
are endpoints of a unique metric segment (denoted by seg(p, q)).
COROLLARY. If p, q ∈ Σ_r(w), p', q' ∈ E_{n,r}, pq = p'q' ≠ πr/2, there exists
a unique extension of the congruence p, q ≈ p', q' to the congruence
seg(p, q) ≈ seg(p', q').³
THEOREM 4.3. If p, q ∈ Σ_r(w), (0 < pq < πr/2) there is exactly one
point p* of Σ_r(w) such that pqp* subsists and pp* = πr/2.
Now if p, q ∈ Σ_r(w), (0 < pq < πr/2) and p*, q* are the unique points
diametral to p, q, respectively, with pqp* and qpq* subsisting, the unique
metric segments seg(p, q), seg(q, p*), seg(p*, q*), seg(q*, p) have pairwise
at most endpoints in common and it follows that the two metric segments
seg(p, q, p*) = seg(p, q) + seg(q, p*),
seg(p, q*, p*) = seg(p, q*) + seg(q*, p*),
have only p, p* in common.
DEFINITION. If p, q ∈ Σ_r(w), (0 < pq < πr/2), then seg(p, q, p*) +
seg(p, q*, p*) is called a one-dimensional subspace Σ_r^1(p, q) of Σ_r(w),
with base points p, q, where pqp* and qpq* subsist.
THEOREM 4.4. A one-dimensional subspace Σ_r^1 of Σ_r(w) is congruent
with the elliptic line E_{1,r}.
PROOF. Σ_r^1 = seg(p, q, p*) + seg(p, q*, p*), where p, q are base
points of Σ_r^1. It follows from the weak four-point property that points
a, b, a*, b* of an elliptic line E_{1,r} exist such that p, q, p*, q* ≈ a, b, a*, b*,
and the two congruences
(*) seg(p, q, p*) ≈ seg(a, b, a*),
(**) seg(p, q*, p*) ≈ seg(a, b*, a*),
map Σ_r^1(p, q) onto E_{1,r}(a, b). To show the mapping is a congruence it is
clear that only the two following cases need be examined in detail.
³ The notation p1, p2, ..., pk ≈ q1, q2, ..., qk signifies that p_i p_j = q_i q_j,
(i, j = 1, 2, ..., k). The symbol "≈" is read, "is (are) congruent to".
Case 1. x ∈ seg(p*, q*), p* ≠ x ≠ q*, y ∈ seg(q, p*), q ≠ y ≠ p*. From
qyp* and qp*q* follows yp*q*, and in a similar manner yp*x subsists.
Hence xy = xp* + p*y, and letting x', y' correspond to x, y by the
congruences (**), (*), respectively, the same considerations establish
x'y' = x'a* + a*y'. Since xp* = x'a* and p*y = a*y', then xy = x'y'.⁴
Case 2. x ∈ seg(q, p*), q ≠ x ≠ p*, y ∈ seg(p, q*), p ≠ y ≠ q*. Since
qxp*, qp*q* imply xp*q*, and pyq*, pq*p* imply yq*p*, the quadruple
x, p*, q*, y contains two linear triples and hence points x'', y'', p'', q'' of
E_{1,r} exist such that x'', y'', p'', q'' ≈ x, y, p*, q*. Since x', a*, b* ≈ x, p*,
q* ≈ x'', p'', q'', a motion G of E_{1,r} onto itself exists with G(x'', y'', p'', q'')
= (x', ȳ, a*, b*). But ȳa* = y''p'' = yp* = y'a*, and ȳb* = y''q'' =
yq* = y'b*. It follows that ȳ = y' (since a*b* ≠ πr/2) and so xy = x'y'.⁵
LEMMA 4.1. If s, t ∈ Σ_r^1 (0 < st < πr/2), then seg(s, t) ⊂ Σ_r^1.
The proof may be taken from [3].
LEMMA 4.2. Any pair of distinct points of Σ_r(w) is contained in a
unique subspace Σ_r^1.
PROOF. If the pair is non-diametral, the result is proved as in [3]. Let
p, p* denote a diametral point pair of Σ_r(w) and suppose q ∈ Σ_r(w) with
pqp*. The unique subspace Σ_r^1(p, q) contains p, p*, and by Theorem 4.4,
Σ_r^1(p, q) ≈ E_{1,r}(p', q'). Let Σ* denote any one-dimensional subspace of
Σ_r(w) containing p and p*, and suppose x ∈ Σ*, x ≠ p, p*, q. Since there
are two linear triples in the quadruple p, q, x, p*, then p, q, x, p* ≈ p'',
q'', x'', d(p'') of E_{1,r}(p', q'), where p''d(p'') = πr/2. A motion G exists such
that G(p'', q'', x'', d(p'')) = (p', q', x̄, d(p')), and one of the relations
p'x̄q', q'x̄d(p'), d(p')x̄d(q'), d(q')x̄p' subsists, or x̄ coincides with one of the
points p', q', d(p'), d(q'). But then x satisfies the corresponding relation
in the unprimed letters, and Lemma 4.1 yields Σ* ⊂ Σ_r^1(p, q).
Interchanging the roles of Σ* and Σ_r^1 gives Σ_r^1(p, q) ⊂ Σ*.
⁴ Obvious modifications of the argument are used in case x = q*, y = q, etc.
⁵ No difficulties are encountered when x, y are not interior points of the segments
from which they are chosen.
LEMMA 4.3. Two congruent triples p1, p2, p3 and p1', p2', p3' of E_{2,r}
are superposable if (i) one of the distances p_i p_j (i, j = 1, 2, 3) equals πr/2, or
(ii) Δ*(p1, p2, p3) < 0.
PROOF. The proof is given in [3].
THEOREM 4.5. Let p1, p2, p3 be three pairwise distinct points of Σ_r(w)
with Δ*(p1, p2, p3) < 0, and p1', p2', p3' points of E_{2,r} with p1, p2, p3 ≈
p1', p2', p3'. The congruences
(1) Σ_r^1(p1, p2) ≈ E_{1,r}(p1', p2'),
(2) Σ_r^1(p1, p3) ≈ E_{1,r}(p1', p3'),
determine uniquely the congruence
Σ_r^1(p1, p2) + Σ_r^1(p1, p3) ≈ E_{1,r}(p1', p2') + E_{1,r}(p1', p3').
PROOF. Since p1, p2, p3 are congruently imbeddable in E_{2,r}, and
Δ*(p1, p2, p3) < 0, it follows that Δ(p1, p2, p3) = |cos(p_i p_j/r)|, (i, j = 1,
2, 3) is non-negative, and no one of the distances p_i p_j (i, j = 1, 2, 3) is
πr/2. Hence p1, p2 and p1, p3 are base points of one-dimensional subspaces
Σ_r^1(p1, p2) and Σ_r^1(p1, p3), respectively. The congruences (1), (2) in which
p_i and p_i' (i = 1, 2, 3) are corresponding points are unique. If Δ(p1, p2, p3)
= 0, then p3 ∈ Σ_r^1(p1, p2) and so Σ_r^1(p1, p2) and Σ_r^1(p1, p3) coincide.
Similarly, E_{1,r}(p1', p2') = E_{1,r}(p1', p3') and the theorem follows from
Theorem 4.4.
If, now, Δ(p1, p2, p3) = Δ(p1', p2', p3') > 0, then (since Δ*(p1, p2, p3) <
0) the points p1', p2', p3' neither lie on an elliptic line, nor are they
congruent with points of a line, and so p1, p2, p3 are not contained in any
Σ_r^1 of Σ_r(w). The congruences (1), (2) give a mapping of Σ_r^1(p1, p2) +
Σ_r^1(p1, p3) onto E_{1,r}(p1', p2') + E_{1,r}(p1', p3'). To prove the mapping a
congruence, suppose x ∈ Σ_r^1(p1, p2), y ∈ Σ_r^1(p1, p3), and let x', y' denote
their corresponding points by congruences (1), (2), respectively.
CASE I. x ∈ seg(p1, p2), p1 ≠ x ≠ p2, y ∈ seg(p1, p3). The possibilities
y = p1, y = p3 offer no difficulties. Supposing that p1yp3 holds, then
points p1'', y'', p2'', p3'' of E_{2,r} exist with p1, y, p2, p3 ≈ p1'', y'', p2'', p3'',
and since p1, p2, p3 ≈ p1', p2', p3' ≈ p1'', p2'', p3'', with Δ*(p1, p2, p3) < 0
a motion G of E_{2,r} exists such that G(p1'', y'', p2'', p3'') = p1', ȳ, p2', p3'.
It is easily seen that ȳ = y' and hence p2'y' = p2y.
Now from p1xp2 follows the existence of points such that p1'', p2'',
x'', y'' ≈ p1, p2, x, y, where the first quadruple is in E_{2,r}, and p1'', p2'', y''
are not necessarily those points (with the same notation) considered in
the preceding paragraph. From Δ*(p1, p2, p3) < 0 it follows that
Δ*(p1, p2, y) < 0. This permits applying to the quadruple p1, p2,
x, y the argument applied above to p1, p2, p3, y, and the congruence
p1, p2, x, y ≈ p1', p2', x', y' is obtained, yielding xy = x'y'.
CASE II. x ∈ seg(p2, d1(p1)), p2 ≠ x ≠ d1(p1), y ∈ seg(p1, p3, d2(p1)),
p1 ≠ y ≠ d2(p1), where d1(p1), d2(p1) denote points of Σ_r^1(p1, p2),
Σ_r^1(p1, p3), respectively, that are diametral to p1.
Let {q_j} be a point sequence of Σ_r^1(p1, p2) with the limit p1, and p1q_jp2
(j = 1, 2, ...). An index m exists such that q_mp1 + p1y + yq_m < πr. For
setting k = πr/2 − p1y > 0, and selecting m so that q_mp1 < k/2 gives
q_mp1 + p1y + yq_m ≤ 2(q_mp1 + p1y) < πr − k. It follows that
Δ*(p1, q_m, y) < 0.
The quadruple p1, p3, q_m, y contains the linear triple p1, p3, y, and
consequently p1, p3, q_m, y ≈ p1'', p3'', q_m'', y'', with the latter quadruple in
E_{2,r}. By Case I, p1, p3, q_m ≈ p1', p3', q_m', and since Δ*(p1, q_m, y) < 0 for
each point y of seg(p1, p3, d2(p1)), then Δ*(p1'', p3'', q_m'') = Δ*(p1, p3, q_m)
< 0, and a motion of E_{2,r} sending p1'', p3'', q_m'' into p1', p3', q_m',
respectively, gives p1, p3, q_m, y ≈ p1', p3', q_m', ȳ. The linearity of p1, p3, y
implies that of p1', p3', ȳ and p1', p3', y'. Consequently p1', p3', ȳ ≈
p1, p3, y ≈ p1', p3', y' implies ȳ = y' and q_my = q_m'y'.
Turning now to the quadruple p1, q_m, x, y, the linearity of p1, q_m, x
together with the relations p1, q_m, y ≈ p1', q_m', y', Δ*(p1, q_m, y) < 0,
permits applying the above procedure to obtain xy = x'y'.
The various cases arising from x and/or y coinciding with one of the
points p1, p2, p3, d1(p1), d2(p1) are all easily handled, and we may
conclude that
(3) seg(p1, p2, d1(p1)) + seg(p1, p3, d2(p1)) ≈
seg(p1', p2', d1(p1')) + seg(p1', p3', d2(p1')).
CASE III. x ∈ Σ_r^1(p1, p2), y ∈ Σ_r^1(p1, p3), with p2p1x, p3p1y
subsisting, and Δ*(p1, p2, y) < 0, Δ*(p1, x, y) < 0.
The method used in Case I is readily applied to yield p3x = p3'x' and
p2y = p2'y'. Now p1, p2, x, y ≈ p1'', p2'', x'', y'', points of E_{2,r}. The
relations p1, p2, y ≈ p1', p2', y', Δ*(p1', p2', y') = Δ*(p1, p2, y) < 0, p2p1x,
p1p2 ≠ πr/2 yield p1'', p2'', x'', y'' ≈ p1', p2', x', y', and so xy = x'y'.
Let o1, o2 denote points that satisfy the conditions imposed above on
x, y, respectively. Using those points in place of p2, p3, and proceeding as
in Case II yields
(4) seg(p1, o1, d1(p1)) + seg(p1, o2, d2(p1)) ≈
seg(p1', o1', d1(p1')) + seg(p1', o2', d2(p1')).
ASSERTION. seg(p1, o1, d1(p1)) + seg(p1, p3, d2(p1)) ≈ seg(p1', o1', d1(p1'))
+ seg(p1', p3', d2(p1')).
PROOF. Suppose x ∈ seg(o1, d1(p1)), and y ∈ seg(p1, p3, d2(p1)). It is
easily seen that seg(p1, o1) contains an interior point q, arbitrarily close
to p1, such that qp1 + p1y + yq < πr, and qp1 + p1p3 + p3q < πr.
By (4), p1, q, o2 ≈ p1', q', o2', and Δ*(p1', q', o2') < 0 follows from
Δ*(p1, o1, o2) < 0 and p1qo1. The familiar procedure now yields p1, p3,
q, o2 ≈ p1', p3', q', o2', points of E_{2,r}. Similarly, it is shown that p1, p3, q,
y ≈ p1', p3', q', y'. Finally, p1, q, x, y ≈ p1'', q'', x'', y'', points of E_{2,r},
since xqp1 holds, and from p1, q, y ≈ p1', q', y', Δ*(p1', q', y') < 0,
p1'q'x', p1'q' ≠ πr/2, it follows that p1, q, x, y ≈ p1', q', x', y', and xy =
x'y'.
Cases not explicitly treated above are either trivial or are handled in a
similar manner.
THEOREM 4.6. Postulate VI is valid in Σ_r(w).
PROOF. Let p0, p1, p2, p3, p4 be any five pairwise distinct points of
Σ_r(w) with Δ*(p0, p1, p2) < 0, and each of the triples p0, p1, p3 and
p0, p2, p4 linear. Then by Theorem 4.5 the sum Σ_r^1(p0, p1) + Σ_r^1(p0, p2)
is congruently imbeddable in E_{2,r}, and since p3, p4 are elements of the
first and second summand, respectively, the five points p0, p1, ..., p4 are
congruently imbeddable in E_{2,r}.
It follows from Theorem 3.1 that the quintuple has the property stated
in the conclusion of Postulate VI.
THEOREM 4.7. Postulates I, II, III, IV, V, VI_w, VII are metric postulates
for elliptic n-space, where Postulate VI_w postulates the elliptic weak
four-point property.⁶
5. Metric spaces with the elliptic feeble four-point property. The
objective of this section is to show that if Postulate I be strengthened to
require metricity, then the class of quadruples assumed congruently
imbeddable in E_{2,r} may be restricted to the proper subclass {Q2} of class
{Q1}. Whether this restriction may be made without strengthening
Postulate I is an open question. Let Σ_r(f) denote a metric space with the
elliptic feeble four-point property in which Postulates II-V are valid.
⁶ Postulate VI_w may be formulated to make Postulate III unnecessary.
The following theorems are easily established.
THEOREM 5.1. Each point triple of Σ_r(f) is congruently imbeddable
in E_{2,r}.
THEOREM 5.2. Two distinct non-diametral points of Σ_r(f) are joined by
exactly one metric segment.
This follows from (1) the existence of at least one metric segment
joining each two distinct points of any complete, metrically convex,
metric space, (2) the uniqueness of midpoints for non-diametral point pairs
of Σ_r(f) (a consequence of the feeble four-point property, since such points
are unique in E_{2,r}), and (3) the fact that each segment of Σ_r(f) is the
closure of the dyadically rational points of the segment.
COROLLARY. The congruence p, q ≈ p', q', 0 < pq < πr/2, (p', q' ∈ E_{2,r})
has a unique extension to the congruence seg(p, q) ≈ seg(p', q').
REMARK. There is exactly one seg(p, q, p*), with p, q, p* ∈ Σ_r(f) and
pqp* subsisting.
LEMMA 5.1. If p, s, m, d(s) ∈ Σ_r(f) such that sd(s) = πr/2, m is a
midpoint of s, d(s) (that is, sm = md(s) = (½)sd(s)) and pd(s) < πr/2, then
points p', s', m', d(s') of E_{2,r} exist such that (p, s) + seg(m, d(s)) ≈
(p', s') + seg(m', d(s')), where (p, s), (p', s') denote the sets consisting of the
points exhibited.
PROOF. If m1 denotes the unique midpoint of p, d(s), the feeble four-point
property gives m1, s, m, d(s) ≈ m1', s', m', d(s'), with the latter
points in E_{2,r}. Similarly, p, m1, d(s), m ≈ p'', m1'', d(s''), m'', points of
E_{2,r}, and since Δ*(m, m1, d(s)) < 0, a motion of E_{2,r} yields p, m1, d(s),
m ≈ p', m1', d(s'), m'. The theorem is proved by showing that the mapping
p ↔ p', s ↔ s', seg(m, d(s)) ≈ seg(m', d(s'))
is a congruence.
If x ∈ seg(m, d(s)) then sx = sm + mx = s'm' + m'x' = s'x'. Since
p, m1, d(s), s ≈ p'', m1'', d(s''), s'', m1, s, d(s) ≈ m1', s', d(s'), and sd(s) =
πr/2, Lemma 4.3 yields p, m1, d(s), s ≈ p*, m1', d(s'), s'. Then p*, m1',
d(s') ≈ p', m1', d(s'), m1'd(s') ≠ πr/2, and p* ∈ E_{1,r}(m1', d(s')), (since
pm1d(s) holds), imply p* = p', and so p, m1, d(s), s ≈ p', m1', d(s'), s'.
It suffices now to show that px = p'x', for x an interior point of
seg(m, d(s)). If m2 denotes the unique midpoint of m, d(s), the above
procedure is applied to obtain p, m1, m2, d(s) ≈ p', m1', m2', d(s'), where
m2' is the midpoint of m', d(s'). A continuation of the process yields
p, m1, q, d(s) ≈ p', m1', q', d(s'), for each dyadically rational point q of
seg(m, d(s)), (that is, for each point q of seg(m, d(s)) such that mq = γ ·
md(s), where γ denotes any dyadically rational number). Then pq = p'q',
and since the set of all the points q is dense in the segment, continuity of
the metric gives px = p'x'.
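The density argument rests on the fact that repeated bisection produces the dyadically rational points of a segment, with gaps that shrink to zero. A minimal sketch, assuming the segment is parametrized by the unit interval (the function name is illustrative):

```python
from fractions import Fraction

def dyadic_points(levels):
    """Parameters in [0, 1] reached after `levels` rounds of taking
    midpoints of all adjacent pairs, starting from the two endpoints."""
    pts = [Fraction(0), Fraction(1)]
    for _ in range(levels):
        merged = []
        for a, b in zip(pts, pts[1:]):
            merged += [a, (a + b) / 2]   # keep a, insert its midpoint with b
        merged.append(pts[-1])
        pts = merged
    return pts

def max_gap(pts):
    return max(b - a for a, b in zip(pts, pts[1:]))
```

After k rounds there are 2^k + 1 points and the largest gap is 2^(−k), so every point of the segment is a limit of such points.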
LEMMA 5.2. Let p, s, m, d(s) denote pairwise distinct points of Σ_r(f)
such that (1) sd(s) = πr/2, (2) m is a midpoint of s, d(s), and (3) x ∈
seg(s, m, d(s)) implies px < πr/2. Points p', s', m', d(s') of E_{2,r} exist such
that
(p) + seg(s, m, d(s)) ≈ (p') + seg(s', m', d(s')).
PROOF. By Lemma 5.1, points p', s', m', d(s') of E_{2,r} exist such that
seg(s, m, d(s)) ≈ seg(s', m', d(s')), ps = p's', and px = p'x' if x ∈
seg(m, d(s)). The lemma is proved by showing that the mapping defined
by
p ↔ p', seg(s, m, d(s)) ≈ seg(s', m', d(s'))
is a congruence.
It suffices to prove that py = p'y', y ∈ seg(s, m), s ≠ y ≠ m. Now
seg(s, m, d(s)) contains a point x̄ such that px ≤ px̄ < πr/2 for every
point x of that segment. Let α = πr/2 − px̄, and subdivide seg(m, y) into
n + 1 equal subsegments by means of points q0 = m, q1, q2, ..., q_{n+1} = y,
such that q_i q_{i+1} < α, and q_{i−1}q_i q_{i+1} subsists, (i = 1, 2, ..., n). If t ∈
seg(m, d(s)) with mt = mq1, then Δ*(p, m, t) < 0, and p, m, t ≈ p', m', t'
by the preceding lemma. It follows that p, m, t, q1 ≈ p', m', t', q1', and so
pq1 = p'q1'. Since Δ*(p, m, q1) < 0, and p, m, q1 ≈ p', m', q1', the above
procedure yields pq2 = p'q2', and repeated application of the process
gives py = pq_{n+1} = p'q'_{n+1}.
LEMMA 5.3. If s, d(s), p, q denote four points of Σ_r(f) with sqd(s),
sd(s) = πr/2, and px1 = px2 = πr/2 for two distinct points x1, x2 of
seg(s, q, d(s)), then (p) + seg(s, q, d(s)) ≈ (p') + seg(s', q', d(s')), and p' is
the pole of seg(s', q', d(s')).
The proof, based upon the superposability of any two congruent triples
of E_{2,r} with a pair of corresponding distances equal to πr/2, offers no
difficulty.
LEMMA 5.4. If s, m, d(s), p are four pairwise distinct points of Σ_r(f)
such that (1) sd(s) = πr/2, (2) m is a midpoint of s, d(s), (3) px0 = πr/2 for
exactly one point x0 of seg(s, m, d(s)), then (p) + seg(s, m, d(s)) is
congruently imbeddable in E_{2,r}.
PROOF. If x0 is an endpoint of seg(s, m, d(s)), the labelling may be
selected so that x0 = s. Then by Lemma 5.1 points p', s', m', d(s') of
E_{2,r} exist such that
(p, s) + seg(m, d(s)) ≈ (p', s') + seg(m', d(s')).
Let y denote any interior point of seg(s, m). The procedure of Lemma 5.2
may be applied to show that py = p'y', and continuity of the metric
gives ps = p's'. The same argument applies in case x0 ≠ s, m, d(s), and
the remaining case (x0 = m) is immediate from Lemma 5.1.
The preceding lemmas establish the following theorem.
THEOREM 5.3. Any subset of Σ_r(f) consisting of the union of a point
and a segment joining two diametral points is congruently imbeddable in
E_{2,r}.
Let I_m denote the strengthened form of Postulate I.
THEOREM 5.4. Postulates I_m, II, III, IV, V, VI_f, VII are metric
postulates for elliptic n-space, where VI_f postulates the elliptic feeble
four-point property.
PROOF. It suffices to show that VI_f implies VI_w. If p, q, s, t ∈ Σ_r(f)
with qst subsisting, then qt = πr/2 implies (p) + seg(q, s, t) congruently
imbeddable in E_{2,r} (Theorem 5.3), and hence so are p, q, s, t. In case
qt ≠ πr/2, then by Postulate V, Σ_r(f) contains a point d(q) such that
qtd(q) and qd(q) = πr/2. Now s ∈ seg(q, t) ⊂ seg(q, t, d(q)), and hence the
congruent imbedding in E_{2,r} of (p) + seg(q, t, d(q)) implies that p, q, s, t
are also imbeddable in E_{2,r}.
6. Metric spaces with the elliptic ε-feeble four-point property. This
section is devoted to showing that the class of quadruples assumed
imbeddable can be restricted to class {Q3}, a proper subclass of {Q2}. Let
Σ_r(ε-f) denote a space satisfying Postulates I_m-V, with every quadruple
of class {Q3} congruently imbeddable in E_{2,r}.
It is easily seen that Theorem 5.2, together with the Corollary and
Remark following it, are valid under the weaker assumption made in this
section.
THEOREM 6.1. The union of any segment of Σ_r(ε-f) joining a
diametral point pair, and any point of the space is congruently imbeddable
in E_{2,r}.
PROOF. Let p, s, d(s) be points of Σ_r(ε-f) with sd(s) = πr/2, and let
X = [x ∈ seg(s, d(s)) | px = πr/2]. Clearly, X is a closed set.
Case I. X is null. Then seg(s, d(s)) admits a partition into equal
non-overlapping subsegments so small in length that the perimeter of each
triple of points contained in any quadruple formed by p and three
adjacent points effecting the partition is less than πr. An argument similar
to that used in the proof of Lemma 5.2 may be applied.
Case II. X = (s) or X = (d(s)). Select the labelling so that X = (s),
and let t be an interior point of seg(s, d(s)). It is easily seen that points
p', t', d(s') of E_{2,r} exist such that
(p) + seg(t, d(s)) ≈ (p') + seg(t', d(s')),
with p, t, d(s) ≈ p', t', d(s'). Extend seg(t', d(s')) to s' so that s't'd(s') and
s'd(s') = πr/2 subsist. The mapping defined by
p ↔ p', seg(s, t, d(s)) ≈ seg(s', t', d(s'))
is easily seen to be a congruence.
Case III. X = (t), t ∈ seg(s, d(s)), s ≠ t ≠ d(s). If u, v ∈ seg(s, d(s))
with sut and tvd(s) subsisting and ut = tv < ε/2, the perimeter of every
triple in the quadruple p, u, t, v is less than πr + ε. Then points p', u', v',
t' of E_{2,r} exist with p, u, t, v ≈ p', u', t', v'. We have seg(s, t, d(s)) ≈
seg(s', t', d(s')), where the latter segment of E_{2,r} contains seg(u', t', v').
The congruence
(p) + seg(u, t, v) ≈ (p') + seg(u', t', v')
is easily established and its extension to
(p) + seg(s, t, d(s)) ≈ (p') + seg(s', t', d(s'))
is proved by the method of Lemma 5.2.
Case IV. X contains at least two points. Then every point of seg(s, d(s))
belongs to X and the desired conclusion is immediate.
THEOREM 6.2. Postulates I_m, II, III, IV, V, VI(ε-f), VII are
metric postulates for elliptic n-space, where VI(ε-f) postulates the elliptic
ε-feeble four-point property.
PROOF. It is clear from Theorem 6.1 that the space has the feeble
four-point property, and so the theorem follows from Theorem 5.4.
It is worth remarking that the argument used in establishing the basic
Theorem 6.1 requires ε to be positive. It seems likely, however, that the
theorem is valid if ε be replaced by zero; that is, if the congruent
imbedding in E_{2,r} of all quadruples p, q, s, t with qs = st = (½)qt and the
perimeter of each triple of points less than πr, be assumed.
7. Metric spaces with the elliptic isosceles weak four-point property and
the elliptic strong two-triple property. This section is concerned with
spaces for which Postulates I_m-V are valid and such that all quadruples
of classes {Q4} and {Q7} are congruently imbeddable in E_{2,r}. Denote the
space Σ_r(i.w.t.t.).
The imbeddability in E_{2,r} of quadruples of class {Q7} suffices to
establish the following theorems and remarks.⁷
THEOREM 7.1. Each two distinct non-diametral points of Σ_r(i.w.t.t.)
are joined by a unique metric segment.
THEOREM 7.2. If p, q ∈ Σ_r(i.w.t.t.), 0 < pq < πr/2, there is exactly
one point d(p) of the space with pqd(p) and pd(p) = πr/2.
REMARK 1. If pqd(p), pd(p) = πr/2, there is a unique seg(p, q, d(p)).
REMARK 2. The relations pqd(p), qpd(q), pd(p) = qd(q) = πr/2, imply
pd(q)d(p).
THEOREM 7.3. A one-dimensional subspace of Σ_r(i.w.t.t.) is congruent
with E_{1,r}.
On the other hand, the proof of the following basic theorem makes no
direct use of the congruent imbedding of quadruples of class {Q7}, but
uses only the imbedding of quadruples of class {Q4}.
THEOREM 7.4. Let p be any point and Σ_r^1 any one-dimensional subspace
of Σ_r(i.w.t.t.). The E_{2,r} contains a point p' and a line E_{1,r} such that
(p) + Σ_r^1 ≈ (p') + E_{1,r}.
⁷ See [1], pp. 217-220.
PROOF. The theorem follows from Theorem 7.3 in case p ∈ Σ_r^1, and is
obviously valid if px = πr/2 for every point x of Σ_r^1. It may be assumed,
therefore, that if f denotes a foot of p on Σ_r^1, then 0 < pf < πr/2. Let
a, b be points of Σ_r^1 such that afb and af = fb = πr/4.
ASSERTION. The point f is the only foot of p on seg(a, f, b).
If there were two additional feet f1, f2 of p on seg(a, f, b), then the
E_{2,r} contains points f1', f2', f', p' with f1, f2, f, p ≈ f1', f2', f', p'. But
p'f1' = p'f2' = p'f' and the linearity of f1', f2', f' imply pf = p'f' = πr/2,
contrary to the above.
Suppose, now, that f1 is a foot of p on seg(a, f, b), f ≠ f1, and denote by
g the midpoint of f, f1. From the congruent imbedding of p, f, g, f1 in
E_{2,r} follows pg = πr/2. Assume the labelling so that gf1b or f1 = b. If x is
interior to seg(a, f), then pf < px < pg, and so a point y of seg(f, g)
exists such that px = py. Similarly, a point z of seg(g, f1) exists such that
px = py = pz. Imbedding p, x, y, z in E_{2,r} yields px = py = pz = πr/2,
and imbedding p, f, x, y in E_{2,r} gives pf = πr/2. Hence the Assertion is
proved.
Select seg(x, y) on E_r^1 so that f is its midpoint, xy < πr/2, and
px + xy + py < πr. If px = py, let the labels q, s replace x, y, respectively,
while in the contrary case, label so that px < py. In the latter event, a
point z of seg(f, y) exists such that pz = px, z ≠ f. Now let q be the label
of x, and s the label of z. Then q, f, s, p ≈ q', f', s', p', with the "primed"
points in E_{2,r} and q'f's' holding. The proof is completed by showing that
the correspondence defined by
p ↔ p', E_r^1(q, f, s) ≈ E_{1,r}(q', f', s')
is a congruence, where the congruence exhibited in the correspondence
is an extension of q, f, s ≈ q', f', s'. If x ∈ E_r^1(q, f, s), its correspondent in
E_{1,r}(q', f', s') will be denoted by "primes".
If g ∈ seg(q, f, s) it is easily seen that p, q, g, s ≈ p', q', g', s', and
consequently
(*) (p) + seg(q, f, s) ≈ (p') + seg(q', f', s').
It follows that f' is the foot of p' on seg(q', f', s'), and q'f' = qf = fs = f's'.
In the triangles (p', f', s') and (p', f', q') the angles ∠p'f's' and ∠p'f'q'
are right angles. Let x, y be points of E_r^1 such that fqx subsists, qx <
min(aq, fq), xfy holds, with fy ≤ fx, and px = py.
Then p, x, y, f ≈ p", x", y", f", with the latter quadruple in E_{2,r}.
ASSERTION. The point f" is the foot of p" on E_{1,r}(x", f", y").
From p"x" = px = py = p"y", the foot of p" on E_{1,r}(x", f", y") is
either the midpoint m" of seg(x", y") or its diametral point d(m") on that
line. The latter alternative is easily seen to be impossible.
Now the midpoint m of seg(x, y) is a point of seg(q, s), and so pm < πr/2.
From the congruent imbedding in E_{2,r} of p, x, m, y it is seen that m" and
f" coincide, and the Assertion is established.
The equalities xf = x"f" = f"y" = fy follow, and letting x', y' denote
the points on E_{1,r}(q', f', s') corresponding to x", y", respectively, yields
x"f" = x'f' = f'y' = f"y", and p"f" = pf = p'f'. The two elliptic right
triangles (p", f", y"), (p', f', y') are congruent, and so py = p"y" = p'y' =
p'x'. Also p"y" = py = px, and consequently px = p'x'. Thus congruence
(*) can be extended to
(p) + seg(x, f, y) ≈ (p') + seg(x', f', y'),
and since pz < πr/2, z ∈ seg(x, y), repetition of the procedure yields
(**) (p) + seg(a, f, b) ≈ (p') + seg(a', f', b').
Let d(f) denote the point of E_r^1(a, f, b) that is diametral to f, and
suppose g is an interior point of seg(a, d(f)). Since agb and pa = pb hold,
p, a, g, b ≈ p", a", g", b", points of E_{2,r}, and p", a", b" ≈ p, a, b ≈
p', a', b'. Since a'b' = πr/2, the two triples are superposable, and a motion
gives p, a, g, b ≈ p', a', g*, b', with g* on E_{1,r}(a', f', b') and a', b', g* ≈
a', b', g'. Then either g' = g* or g* is the reflection of g' in a'. The latter
case is easily seen to be impossible. Hence p, a, g, b ≈ p', a', g', b' and
pg = p'g'.
In a similar manner it is seen that pv = p'v', for v an interior element
of seg(b, d(f)), and the theorem is proved.
It follows at once that E_r (i.w.t.t.) has the elliptic weak four-point
property and the following axiomatization results.
THEOREM 7.5. Postulates I_m, II, III, IV, V, VI (i.w.t.t.), VII are
metric postulates for elliptic n-space, where Postulate VI (i.w.t.t.) asserts
that every quadruple of classes {Q_4} and {Q_7} is congruently imbeddable in
E_{2,r}.
8. Metric spaces with quadruples of classes {Q_5}, {Q_6}, {Q_7} imbeddable in
E_{2,r}. By virtue of the strong two-triple property (the imbedding of
quadruples of class {Q_7}), Theorems 7.1, 7.2, 7.3 of the preceding section
are valid here. Let E_r* denote any metric space satisfying the demands of
Postulates II-V, and such that each quadruple of its points belonging to
{Q_5}, {Q_6}, or {Q_7} is congruently imbeddable in E_{2,r}.
THEOREM 8.1. The set sum of any point and any one-dimensional
subspace of E_r* is congruently imbeddable in E_{2,r}.
PROOF. It suffices to consider the case of p ∈ E_r*, E_r^1 ⊂ E_r*, f a foot
of p on E_r^1, and 0 < pf < πr/2. It is not difficult to show that (1) f is
unique and (2) E_r^1 contains at most one point g with pg = πr/2. If such a
point g exists, and f*, g* denote those points of E_r^1 diametral to f, g,
respectively, let x_1, x_2 be points of seg(f, g), seg(f, g*, f*), respectively,
such that fx_1 < fx_2. It may be shown that px_1 < px_2.
Let a, b be the two midpoints of f, f* on E_r^1, and suppose g ∈ seg(f, a, f*).
Choose points x, y of E_r^1 so that x ∈ seg(a, f), y ∈ seg(f, b), 2xy < fg,
px = py, and Δ*(p, x, y) < 0. Assume that the midpoint m of x, y is
distinct from f. Points s, t of E_r^1 exist such that s ∈ seg(x, f), t ∈ seg(f, y),
ps = pt and st = 2·ty. Then p, s, t, y ≈ p", s", t", y", points of E_{2,r}. Let
z be a point of E_r^1 such that ts = 2·sz. Then it is easily seen that
p, s, t, z ≈ p", s", t", z", and p"z" = p"y". Since p"y" = py = px and
px = pz, with z interior to seg(f, g), it follows that x = z. Hence xs = ty,
and m is the midpoint of s, t. Repeating this procedure a finite number of
times establishes m as the midpoint of a pair of points for which it is not a
betweenpoint. As a result of this contradiction, we conclude that pairs of
points in seg(x, y) having f as their midpoint are equidistant from p.
Select points a', f', b', p' of E_{2,r} such that E_r^1(a, f, b) ≈ E_{1,r}(a', f', b') is
an extension of a, f, b ≈ a', f', b', and f' is the foot of p' on E_{1,r}(a', f', b'),
with p'f' = pf. The proof is completed by showing that the mapping
p ↔ p', E_r^1(a, f, b) ≈ E_{1,r}(a', f', b')
is a congruence.
THEOREM 8.2. Postulates I_m, II, III, IV, V, VI*, VII are metric
postulates for elliptic n-space, where Postulate VI* asserts the congruent
imbedding in E_{2,r} of all point quadruples of classes {Q_5}, {Q_6}, {Q_7}.
9. A fundamental unsolved problem. It is known that every semimetric
space is congruently imbeddable in the E_{2,r} whenever each 7 of its points
are so imbeddable [7].⁸ This is stated by saying that the elliptic plane
has congruence indices {7, 0} with respect to the class of semimetric
8 See also [4], in which congruence indices {8, 0} are proved.
spaces. Since the E_{2,r} contains an equilateral sextuple, the congruence
indices {7, 0} are the best (that is, for no integers m, k (m < 7) are indices
{m, k} valid). This result, together with Theorem 3.1, completely solves
the congruent imbedding problem for E_{2,r} and hence provides a metric
axiomatization for the class of subsets of E_{2,r}. Metric postulates for the
E_{2,r} itself are obtained by adjoining any metric properties that serve to
distinguish the elliptic plane among its subsets, though, as observed
earlier in this paper, this approach to the metric characterization of a
space is likely to result in a redundant set of postulates.
Perhaps the most important unsolved problem suggested by this manner
of studying elliptic geometry is the determination of congruence indices for
E_{n,r} when n > 2. The problem for the E_{2,r} is quite difficult, and the
methods employed there seem incapable of extension even to E_{3,r}.
Apparently an entirely different approach is needed. At the present time
not even any preliminary results concerning the problem for general
dimension have been obtained.
Bibliography
[1] BLUMENTHAL, L. M., Theory and applications of distance geometry. The
Clarendon Press, Oxford 1953, XI + 347 pp.
[2] BLUMENTHAL, L. M., An extension of a theorem of Jordan and von Neumann.
Pacific Journal of Mathematics, vol. 5 (1955), pp. 161-167.
[3] BLUMENTHAL, L. M., Metric characterization of elliptic space. Transactions
American Mathematical Society, vol. 59 (1946), pp. 381-400.
[4] BLUMENTHAL, L. M., and KELLY, L. M., New metric-theoretic properties of
elliptic space. Revista de la Universidad Nacional de Tucuman, vol. 7 (1949), pp. 81-107.
[5] HANKINS, J. D., Metric characterizations of elliptic n-space. University of
Missouri doctoral dissertation, 1954.
[6] MENGER, K., Untersuchungen über allgemeine Metrik. Mathematische Annalen,
vol. 100 (1928), pp. 75-163.
[7] SEIDEL, J., De Congruentie-orde van het elliptische vlak. Thesis, University of
Leiden, 1948, iv + 71 pp.
[8] WILSON, W. A., A relation between metric and euclidean spaces. American
Journal of Mathematics, vol. 54 (1932), pp. 505-517.
Symposium on the Axiomatic Method
AXIOMS FOR GEODESICS AND THEIR IMPLICATIONS
HERBERT BUSEMANN
University of Southern California, Los Angeles, California, U.S.A.
The foundations of geometry are principally concerned with elementary
geometry and in particular with the role of continuity. Although our
intuition relies on continuity, very large sections of euclidean and non-
euclidean geometry prove valid without continuity hypotheses.
This lecture deals with the foundations of metric differential geometry.
Continuity is taken for granted and the interest centers on the question
to which extent differentiability is necessary. In order to delineate the subject
more clearly we emphasize that we do not mean results like that of
Wald [15], who characterizes Riemannian surfaces with a continuous
Gauss curvature among metric spaces, because his point is not the
weakening or omission of differentiability properties but their replace-
ment by other limit processes.
Two major advances have been made in the indicated direction. One,
our principal subject, is due to the author and concerns, roughly,
the intrinsic geometry in the large of not necessarily Riemannian spaces;
most of this theory can be found in the book [3]. The second is the work
of A. D. Alexandrov and deals with surfaces either in E3 or with an abstract
intrinsic Riemannian metric. Much of this material is found in his book
[1], a brief survey in [6]. In both theories the tools created in order to do
without differentiability assumptions proved in many instances far
superior to the classical ones; in fact, they yield a number of results which
remain inaccessible to the traditional methods, even when smoothness is
granted.
It also appears that a frequently followed procedure, which works, as it
were, from the top down by reducing differentiability hypotheses in
existing proofs, has, in general, very little chance of producing final
results.
The axioms for a G-space, [3, Chapter I]. Since we are interested in
metric differential geometry our first axiom is :
I. The space is metric.
We call the space R, denote points by small Roman letters, the distance
from x to y by xy. Since the concept of metric space has been generalized
in various ways we mention explicitly that we require the standard
properties: xx = 0, xy = yx > 0 if x ≠ y, and the triangle inequality
xy + yz ≥ xz. But large parts of the theory hold without the symmetry
condition xy = yx.
The relations x ≠ y, y ≠ z and xy + yz = xz will be written briefly as
(xyz). A set M in R is bounded if xy < β for a suitable β and all x, y in M.
Our second axiom is the validity of the Bolzano-Weierstrass theorem:
II. A bounded infinite set has an accumulation point.
In conjunction with the following axioms, II entails that the space is
complete and behaves in all essential respects like a finite-dimensional
space. Whether the axioms actually imply finite dimensionality is an
open question.
The third axiom guarantees that the metric is intrinsic. It was intro-
duced by Menger as convexity of a metric space:
III. If x ≠ z, then a point y with (xyz) exists.
It follows from I, II, III that any two points x, y can be connected by a
segment T(x, y), i.e., a set isometric to an interval [α, β] of the real t-axis.
T(x, y) can therefore be represented in the form z(t), α ≤ t ≤ β = α + xy, with
(1) z(t_1)z(t_2) = |t_1 − t_2|,
and z(α) = x, z(β) = y.
These axioms are satisfied, for example, by a closed convex subset of a
euclidean space. However, we aim at geometries which cannot be ex-
tended without increasing the dimension. Obviously some form of
prolongability is necessary. Requiring that any segment can be prolonged
would be too strong, it would eliminate even the ordinary spherical
metric. If S(p, ρ) denotes the set of points x with px < ρ, we postulate:
IV. Every point p has a neighborhood S(p, ρ_p), ρ_p > 0, such that for
any two distinct points x, y in S(p, ρ_p) a point z with (xyz) exists.
The generality of the function ρ_p is deceiving; the axiom furnishes
a function ρ(p) > 0 satisfying IV and the Lipschitz condition
|ρ(p) − ρ(q)| ≤ pq.
The four axioms yield geodesics. A geodesic is a locally isometric image
of the real t-axis; precisely, it can be represented in the form z(t),
−∞ < t < ∞, and there is a positive function ε(t) such that (1) holds
for |t_i − t| ≤ ε(t), i = 1, 2. Thus, the geodesics on an ordinary cylinder
are either entire helices, entire straight lines or circles traversed infinitely
often.
Geodesics exist in the following sense: a function z(t) satisfying (1) in an
interval α ≤ t ≤ β, α < β, can be extended to all real t, so that it
represents a geodesic. This is the analogue to the indefinite continuation of
a line element into a geodesic in the classical case.
Axioms I to IV contain no uniqueness properties. In an (x_1, x_2)-plane
metrized by xy = |x_1 − y_1| + |x_2 − y_2| any curve z(s) = (z_1(s), z_2(s)),
a ≤ s ≤ b, for which both z_1(s) and z_2(s) are monotone is a segment from
z(a) to z(b). We observe that in the classical case a segment can be
prolonged by a given amount in at most one way and therefore postulate:
V. If (xyz_1), (xyz_2) and yz_1 = yz_2, then z_1 = z_2.
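The lack of uniqueness in the plane metrized by the sum of the absolute coordinate differences can be checked directly. The following is a minimal Python sketch and not part of the lecture; the helper names `taxicab`, `between` and `polyline_length` and the sample points are illustrative assumptions. It compares two different monotone paths between the same endpoints and exhibits two distinct prolongations of one segment, so that the condition in V is violated.

```python
def taxicab(x, y):
    """Distance xy = |x1 - y1| + |x2 - y2| between points of the plane."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])


def between(x, y, z):
    """The relation (xyz): x != y, y != z and xy + yz = xz."""
    return x != y and y != z and taxicab(x, y) + taxicab(y, z) == taxicab(x, z)


def polyline_length(points):
    """Length of a polygonal path, measured with the taxicab distance."""
    return sum(taxicab(p, q) for p, q in zip(points, points[1:]))


# Two different paths from (0, 0) to (2, 1), each monotone in both coordinates.
path_a = [(0, 0), (2, 0), (2, 1)]
path_b = [(0, 0), (0, 1), (2, 1)]
print(polyline_length(path_a), polyline_length(path_b), taxicab((0, 0), (2, 1)))

# Axiom V fails here: the segment from x to y has two prolongations of length 1.
x, y, z1, z2 = (0, 0), (1, 0), (2, 0), (1, 1)
print(between(x, y, z1), between(x, y, z2), taxicab(y, z1) == taxicab(y, z2))
```

Both paths have length 3, which is the distance of their endpoints, so both are segments; integer coordinates are used so that the equality tests are exact.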
The five axioms guarantee that the above extension z(t) of a segment to
a geodesic is unique. Moreover, if (xyz) then T(x, y) and T(y, z) are unique
(because two different T(y, z) would yield two different prolongations of
T(x, y)). In particular, T(x, y) is unique for x, y ∈ S(p, ρ_p), so that the
local uniqueness of the shortest connection, which is so important for many
investigations in differential geometry, need not be explicitly stipulated.
The spaces satisfying the five axioms are called G-spaces, the G alluding
to geodesic.
There are two particularly simple types of geodesics. The first are those
which satisfy (1) for arbitrary t_1, t_2 and are therefore isometric to the
entire real axis; they are called straight lines. The others are the so-called
great circles, which are isometric to ordinary circles. A representation
z(t) of a great circle of length β is characterized by
z(t_1)z(t_2) = min_{|ν| = 0, 1, 2, ...} |t_1 − t_2 + νβ|.
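The distance formula for a great circle can be evaluated without running through all integers ν. The sketch below is illustrative only: the function names and the sample value of the length are assumptions, and the brute-force version is included purely as a cross-check of the closed form.

```python
import math


def great_circle_distance(t1, t2, beta):
    """min over integers nu of |t1 - t2 + nu * beta|, for a circle of length beta."""
    r = abs(math.fmod(t1 - t2, beta))  # 0 <= r < beta
    return min(r, beta - r)


def brute_force(t1, t2, beta, span=50):
    """Same minimum, taken naively over nu = -span, ..., span (for comparison)."""
    return min(abs(t1 - t2 + nu * beta) for nu in range(-span, span + 1))


beta = 4.0  # assumed sample length
for t1, t2 in [(0.0, 1.0), (0.0, 3.0), (0.5, 4.5), (-7.0, 2.0)]:
    print(t1, t2, great_circle_distance(t1, t2, beta), brute_force(t1, t2, beta))
```

For parameter values less than half the length apart the result is simply the absolute difference of the parameters, which is the local isometry required of a geodesic.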
The cylinder shows that straight lines, great circles and geodesics which
are neither may occur in one space. When IV holds in the large, that is, when
a point z with (xyz) exists for any two distinct points x, y, then all
geodesics are straight lines (and conversely), and the space is called straight.
The lowest dimensional G-spaces are uninteresting. A 0-dimensional
space is obviously a point and a one-dimensional G-space is a straight line
or a great circle. The two-dimensional G-spaces can be proved to be
topological manifolds; the corresponding problem for higher dimensions
is open.
It is important to notice that the axioms comprise the Finsler spaces,
where the line element has the form ds = f(x_1, ..., x_n; dx_1, ..., dx_n) =
f(x, dx), and f(x, dx) satisfies certain standard conditions (see [3, Section
15]) but need not be quadratic or Riemannian, ds² = Σ g_ik(x) dx_i dx_k. The
analytical methods often become highly involved for Finsler spaces. This
explains why the limitation of the hypotheses inherent to the axiomatic
approach leads in this case to improved methods, which have the ad-
ditional appeal of effecting a synthesis of differential geometry, topology,
the calculus of variations, the foundations of geometry, and convex body
theory.
Spaces with negative curvature, [3, Chapter V]. It is impossible to
outline the whole theory of G-spaces in the space available here. We
therefore restrict ourselves to giving a few typical results and discuss in
greater detail only the theory of parallels which is more closely related to
the remaining geometric topics of this symposium.
Hadamard discovered in [8] that the surfaces with negative curvature
have many beautiful properties. For Riemann spaces his results were ex-
tended by others in various directions. The very concept of curvature
seems to imply notions of differentiability. However, each of the following
two properties proves in the Riemannian case to be equivalent to non-
positive (negative) curvature:
For each point p there is a positive δ_p, 0 < δ_p ≤ ρ_p, such that
(a) if a, b, c lie in S(p, δ_p) but not on a segment, and b', c' are the
midpoints of T(a, b) and T(a, c), then 2b'c' ≤ bc (2b'c' < bc);
(b) if C(T, ε) denotes the set xT < ε, where T is a segment, and C(T, ε) ⊂
S(p, δ_p), then C(T, ε) is convex (strictly convex).
Convexity is defined in the usual way by means of the T(x, y).
In Finsler spaces, hence also in G-spaces, (a) is stronger than (b). The
geometry discovered by Hilbert [9, Anhang I] (which corresponds to the
Klein model of hyperbolic geometry with a convex curve replacing the
ellipse as absolute locus) furnishes an example where (b) holds but not (a).
In fact, a Hilbert geometry satisfying (a) is hyperbolic (see Kelly and
Strauss [11]). The condition (b) was introduced by Pedersen [12].
Practically all of Hadamard's and the later results on Riemann spaces
with non-positive or negative curvature hold for G-spaces with the property
of domain invariance satisfying (a), and many are valid when (b) holds. In
particular, (b) implies that the universal covering space of the given
space R is straight. Consequently, for two given points of R, the geodesic
connection is unique within a given homotopy class. Moreover, when the
C(T, e) are strictly convex, then there is at most one closed geodesic
within a given class of freely homotopic curves; if compact, the space
cannot have an abelian fundamental group and possesses only a finite
number of motions (isometries of the space on itself). The latter result is,
for Riemann spaces, contained in Bochner [2] and for the general case in
[7]. The theory of parallels (see below) requires (a) instead of (b).
Characterizations of the elementary spaces, [3, Chapter VI]. For brevity
we call the euclidean, hyperbolic, and spherical spaces elementary. The
bisector B(a, a') of two distinct points a, a' in a G-space is the locus of the
points x equidistant from a and a'; that is, ax = a'x. The elementary
spaces, with the exception of the 1-sphere, are characterized by the fact that
their bisectors contain with any two points x, y at least one segment T(x, y).
The principal theorem is the following local version:
(2) Let 0 < δ ≤ ρ_p and assume that for any five distinct points a, a', b, c, x in
S(p, δ) the relations ab = a'b, ac = a'c and (bxc) entail ax = a'x. Then
S(p, δ) is isometric to an open sphere of radius δ in an elementary space.
The hypothesis means, of course, that b, c ∈ B(a, a') implies x ∈ B(a, a').
Whereas the proofs for the results on spaces with non-positive curvature
are not essentially longer than they would be under differentiability
hypotheses, such hypotheses would very materially shorten the long proof
of (2) in [3, Sections 46, 47].
Various well-known theorems are corollaries of (2) ; examples are the
following global and local answers to the Helmholtz-Lie Problem:
If for two given isometric triples a_1, a_2, a_3 and a_1', a_2', a_3' (i.e.,
a_i a_k = a_i' a_k') of a G-space R a motion of R exists taking a_i into a_i',
i = 1, 2, 3, then R is elementary.
If every point p of a G-space R has a neighborhood S(p, δ), 0 < δ < ρ_p,
such that for any four points a_1, a_2, a_1', a_2' in S(p, δ) with pa_1 = pa_1',
pa_2 = pa_2' and a_1a_2 = a_1'a_2' a motion of S(p, δ) exists which takes a_i into
a_i', i = 1, 2, then the universal covering space of R is elementary.
By using deeper results from the modern theory of topological and Lie
groups, Wang [16] and Tits [14] succeeded in determining all spaces with
the property that for any two pairs a_1, a_2 and a_1', a_2' with a_1a_2 = a_1'a_2' a
motion exists taking a_i into a_i', i = 1, 2. If the space has an odd dimension
then it is either elementary or elliptic ; for even dimensions greater than 2
there are other solutions.
Inverse problems in the large for surfaces, [3, Sections 11, 33], [4], [13].
In inverse problems of the calculus of variations one gives a set of curves
and asks whether they occur as the extremals of a variational problem.
The local inverse problem in two dimensions was solved by Darboux, but
his method provides no answers in the large. Inverse problems in the large
cannot be treated by one method; they differ depending on the topological
structure of the surface, and are inaccessible to the traditional
approach.
Three of these problems have been solved with the present methods.
We mention first that a G-space, in which the geodesic through two (distinct)
points is unique, is either straight, or all its geodesics are great circles of the
same length β (see [3, Theorem (31.2)]). In the latter case we say that the
space is of the elliptic type. If its dimension exceeds 1, then it has a
two-sheeted universal covering space which shares with the sphere the
properties that all geodesics are great circles of the same length 2β, and
that all geodesics that pass through a given point meet again at a second
point.
A two-dimensional G-space R, in which the geodesic through two
points is unique, is therefore either homeomorphic to the euclidean plane
E² or to the projective plane P². If in the case of E² a euclidean metric
e(x, y) is introduced, then the geodesics of R form a system N of curves
which have, in terms of e(x, y), the following two properties:
1) Each curve in N is representable in the form z(t), −∞ < t < ∞, with
z(t_1) ≠ z(t_2) for t_1 ≠ t_2 and e(z(0), z(t)) → ∞ for |t| → ∞.
2) There is exactly one curve of N through two distinct points.
The answer to the corresponding inverse problem is:
If a system N of curves in E² with the properties 1), 2) is given, then the
plane can be remetrized as a G-space with the curves in N as geodesics.
It will become clear from examples later that the problem of determining
all metrics with the curves in N as geodesics has too many solutions
to be interesting.
The inverse problem for P² was solved by Skornyakov [13]; a simpler
proof is found in [4]:
In P² let a system N' of curves homeomorphic to a circle be given and such
that there is exactly one curve of N' through 2 given distinct points. Then
P² may be metrized as a G-space with the curves in N' as geodesics.
The third problem solved with these methods is that of a torus with a
straight universal covering space. It differs from the preceding problems in
that there are non-obvious necessary conditions. In the plane as the
universal covering space of the torus we can introduce an auxiliary
euclidean metric e(x, y) and cartesian coordinates x_1, x_2 such that the
covering transformations are the translations T(m_1, m_2):
x_1' = x_1 + m_1, x_2' = x_2 + m_2, m_1, m_2 integers.
To the geodesics on the torus there then corresponds in the plane a
system N of curves which satisfies the conditions 1), 2) above and in
addition the following:
3) N goes into itself under the T(m_1, m_2).
4) If a curve in N passes through q and qT(m_1, m_2) then it also passes
through the points qT(νm_1, νm_2), |ν| = 1, 2, ....
5) N satisfies the parallel axiom (in its usual form, see below).
Whereas it is not hard to establish 4), the proof of 5) is far from obvious
and represents, as far as the author is aware, the only instance in the
literature where the validity of the parallel axiom appears as a non-
trivial theorem.
If a system N of curves in the plane is given which has the properties 1)
to 5), then the plane can be metrized as a G-space with the curves in N as
geodesics, where the metric is invariant under the T(m_1, m_2) and thus yields
a metrization of the torus.
The curves in N need not satisfy Desargues' Theorem, even when N and
the metric are invariant under T(m_1, r), where r is an arbitrary real number.
These facts exhibit very clearly the great generality of Finsler spaces
as compared to Riemann spaces: the only Riemannian metrizations of the
torus such that the universal covering space is straight are euclidean (see E.
Hopf [10]).
The theory of parallels, [3, Chapter III]. In the foundations of geometry
the congruence axioms, parallel axiom (euclidean or hyperbolic), and
the continuity axioms usually appear in this order. The present theory
suggests the study of parallelism on the basis of continuity without
congruence or mobility axioms.
In a straight space denote by G(a, b), a ≠ b, the geodesic, briefly line,
through a and b and by G⁺(a, b) the same line with the orientation in
which b follows a. Let G⁺ be any oriented line, p any point. Then G⁺(p, x)
converges to a line A⁺ when x traverses G⁺ in the positive direction. The
convergence of G⁺(p, x) is trivial in the plane, but not in higher dimensions.
A⁺ is called the asymptote to G⁺ through p. It is independent of p in the
sense that for q ∈ A⁺ the line G⁺(q, x) also tends to A⁺.
Denote by A⁻, G⁻ the opposite orientations of the lines A, G carrying
A⁺, G⁺. If A⁺ and A⁻ are asymptotes to G⁺ and G⁻ respectively, then we
call A parallel to G. These definitions suggest investigating the following
properties:
SYMMETRY: If A⁺ is an asymptote to G⁺, then G⁺ is an asymptote to A⁺.
If A is parallel to G, then G is parallel to A.
TRANSITIVITY: If A⁺ is an asymptote to B⁺, and B⁺ is to C⁺, then so is
A⁺ to C⁺.
It is very easily seen that the transitivity of the asymptote relation implies
its symmetry. The converse holds in the plane, but it is not known whether
this extends to higher dimensions.
Even in a plane the asymptotic relation is not always symmetric. In an
(x, y)-plane let H, H_1 be the branches x < 0 of the hyperbolas xy = −1
and xy = 1 respectively. Let H⁻, H_1⁻ be their orientations corresponding
to decreasing x. The system N consists of all curves obtainable by
translations from H, of the lines y = mx + b, m ≤ 0, and the lines x = const.
The system N_1 consists of the curves obtainable by translations from H
or H_1 and of the lines x = const., y = const. Each of the systems satisfies
the conditions 1), 2) above and hence serves as system of geodesics for a
G-space.
Denote by Y⁺ and Y⁺_{−1} the lines x = 0, x = −1 with the orientations
corresponding to increasing y. In both systems N, N_1, Y⁺ is an asymptote
to Y⁺_{−1} and so is Y⁻ to Y⁻_{−1}; thus, Y is parallel to Y_{−1}. In N the line
Y⁺_{−1} is not the asymptote to Y⁺ through (−1, 1), but H⁺ is, whereas Y⁻_{−1} is
an asymptote to Y⁻. In the system N_1 neither Y⁺_{−1} is an asymptote to Y⁺
nor is Y⁻_{−1} to Y⁻.
In the plane the parallel axiom in its usual form (namely, if p ∉ G then
there is exactly one line A through p which does not intersect G) is equivalent
to postulating that for any p and G, if A⁺ is the asymptote to G⁺ through p,
then so is A⁻ to G⁻. The uniqueness of the non-intersecting line implies
symmetry and transitivity.
The usual formulation of the corresponding hyperbolic axiom (if p ∉ G
and A⁺ is the asymptote to G⁺ through p, then A⁻ is not an asymptote to G⁻)
does not imply symmetry. The intersections of the curves in the system N
just constructed with the domain x < 0, y > −x provide an example.
This corresponds to the fact that in the foundations of hyperbolic geo-
metry symmetry and transitivity of the asymptote relation are proved
with the help of the congruence axioms.
Other questions concern the distances from A⁺ to G⁺. In any straight
space the existence of points x_ν ∈ A⁺ and y_ν ∈ G⁺ which tend on A⁺
and G⁺ in the positive direction to ∞ and for which x_ν y_ν → 0 is sufficient
for A⁺ and G⁺ to be asymptotes to each other. But boundedness of xG,
when x traverses a positive subray of A⁺, is neither necessary nor sufficient
for A⁺ to be an asymptote to at least one orientation of G. In fact, very
surprising phenomena occur even for the ordinary lines as geodesics.
Let g(t) be defined and continuous for t ≥ 0, g(0) = 0, g(t_1) < g(t_2) for
t_1 < t_2 and g(t) → ∞ for t → ∞. Put
f(x, α) = sign(x_1 cos α + x_2 sin α) g(|x_1 cos α + x_2 sin α|).
Then the arguments of [3, Section 11] show readily that
(3) ρ_g(x, y) = ∫_{−π/2}^{π/2} |f(x, α) − f(y, α)| dα
is a metrization of the (x_1, x_2)-plane as a G-space with the lines ax_1 +
bx_2 + c = 0 as geodesics. ρ_g(x, y) is invariant under x_1' = x_1 cos α − x_2 sin α,
x_2' = x_1 sin α + x_2 cos α, so that the metric even possesses the rotations
about (0, 0). Nevertheless, simple estimates show that for g(t) = log(1 + t)
any two parallel lines G_1, G_2 have the property that ρ_g(x, G_2) → 0 when x
traverses G_1 in either direction. For g(t) = e^t − 1, we have ρ_g(x, G_2) → ∞.
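The integral (3) is easy to approximate numerically, which gives a quick plausibility check of the statement that ordinary lines are the geodesics. The following Python sketch is an illustration under stated assumptions and not the author's method: the midpoint rule, the number of nodes `n`, and the sample points are all choices made here. A short calculation shows that for g(t) = t the integral equals twice the euclidean distance, which serves as a sanity check.

```python
import math


def f(x, alpha, g):
    """f(x, alpha) = sign(s) * g(|s|) with s = x1 cos(alpha) + x2 sin(alpha)."""
    s = x[0] * math.cos(alpha) + x[1] * math.sin(alpha)
    return math.copysign(g(abs(s)), s)


def rho(x, y, g, n=4000):
    """Midpoint-rule approximation of the integral (3) over [-pi/2, pi/2]."""
    h = math.pi / n
    nodes = (-math.pi / 2 + (i + 0.5) * h for i in range(n))
    return h * sum(abs(f(x, a, g) - f(y, a, g)) for a in nodes)


# For g(t) = t the integral is twice the euclidean distance.
print(rho((0.0, 0.0), (3.0, 4.0), lambda t: t))

# For g(t) = log(1 + t), distances add up along a straight line ...
x, y, z = (0.0, 1.0), (1.0, 1.0), (3.0, 1.0)
print(rho(x, y, math.log1p) + rho(y, z, math.log1p), rho(x, z, math.log1p))
# ... but not through a point w off the line.
w = (1.0, 2.0)
print(rho(x, w, math.log1p) + rho(w, z, math.log1p))
```

The additivity along the line holds for each value of the angle separately, because the integrand is monotone along a straight line; this is why the two numbers in the second printout agree up to rounding.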
In straight spaces which satisfy the condition (a) for nonpositive
curvature, the boundedness of xG when x traverses a positive subray of
A⁺ is necessary and sufficient for A⁺ to be an asymptote to a suitable
orientation of G, so that the asymptote relation is transitive.
For the foundations of geometry it is of greater interest to see how
mobility eliminates the various abnormal occurrences. Assume a plane P
which is metrized as a straight space possesses a motion α which, reduced
to the straight line G, is a proper translation of G (z(t)α = z(t + a), a ≠ 0).
Then the asymptote relation is transitive within the families of asymptotes
to G⁺ and G⁻ (see [3, Section 32]). The parallels to G are exactly those
lines which go into themselves under α. They either cover all of P, or
a closed halfplane, or a closed strip which may reduce to G. If p does not
lie on a parallel to G, let G_1 be the parallel to G (possibly G itself) closest
to p. If x traverses an asymptote A⁺ to G_1⁺ (or G⁺) in the positive
direction, then xG_1 → 0; if x traverses A⁺ in the negative direction, then
A characterization of the higher dimensional euclidean geometry, [3,
Theorem (24.10)]. In the foundations of geometry parallelism for lines in
space reduces to that in a plane, because only spaces are considered in
which any three points lie in a plane. In straight spaces of higher dimen-
sion than two we mean by the parallel axiom the following two require-
ments :
The asymptote relation is symmetric. If A⁺ is an asymptote to G⁺ then so
is A⁻ to G⁻.
The metrics ρ_g(x, y), which can be extended to higher dimensions, show
that this parallel axiom and the Theorem of Desargues or the existence
of planes do not imply that the space is Minkowskian (finite dimensional
linear). Without any postulates regarding the existence of planes the
higher dimensional euclidean geometry is characterized by the above parallel
axiom together with the existence and symmetry of perpendicularity in this
sense :
(4) If p ∉ G, f ∈ G, and pf = min_{x ∈ G} px, then for every x ∈ G also
xf = min_{y ∈ G(p, f)} xy.
It is well-known, see [3, p. 104], that there are Minkowski planes which
are not euclidean and satisfy (4). Also, (4) is without the parallel axiom a
weak condition; it is, for example, satisfied by every simply connected
Riemann space with non-positive curvature. This follows from [3, Theo-
rems (20.9) and (36.7)] and the symmetry of perpendicularity in Riemann
spaces.
Similarities and differentiability, [5]. It is natural to ask how we can
recognize from the behaviour of our finite distances xy whether a G-space
possesses differentiability properties in terms of suitable coordinates. The
best guide is a statement which is formulable and false without, but
correct with differentiability hypotheses.
A proper similarity of a G-space $R$ is a mapping $\alpha$ of $R$ on itself such
that $x\alpha\, y\alpha = k \cdot xy$ for all $x, y \in R$, where $k$ is a positive constant
different from 1. A proper similarity has exactly one fixed point $f$. Because
$\alpha^{-1}$ is also a similarity and its factor is $k^{-1}$, we may assume that $k < 1$. Then
$x\alpha^{\nu}\, y\alpha^{\nu} = k^{\nu} \cdot xy$ and $x\alpha^{\nu}\, x\alpha^{\nu+1} = k^{\nu} \cdot x\, x\alpha$ show that $x\alpha^{\nu}$ is a Cauchy
sequence with a limit $f$ and also that $y\alpha^{\nu} \to f$. It follows readily that the
space is straight.
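The Cauchy-sequence argument can be tried numerically. The following is only a minimal sketch in the Euclidean plane, not part of the original text: the particular similarity (rotation by 60 degrees, factor k = 1/2, a fixed translation) and the names `similarity`, `iterate` and `dist` are illustrative assumptions.

```python
import math

def similarity(point, k=0.5, angle=math.pi / 3, shift=(2.0, -1.0)):
    """Assumed example of a proper similarity: rotate, scale by k, translate."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (k * (c * x - s * y) + shift[0],
            k * (s * x + c * y) + shift[1])

def dist(p, q):
    """Euclidean distance, playing the role of the distance xy."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def iterate(point, steps):
    """Apply the similarity `steps` times (the point x alpha^nu)."""
    for _ in range(steps):
        point = similarity(point)
    return point

x, y = (10.0, 4.0), (-3.0, 7.0)

# After nu = 5 steps the distance has shrunk by the factor k^5.
ratio = dist(iterate(x, 5), iterate(y, 5)) / dist(x, y)

# Both orbits approach the same point, which the map leaves fixed.
f_from_x = iterate(x, 60)
f_from_y = iterate(y, 60)
print(ratio, f_from_x, f_from_y)
```

Starting from two different points gives the same limit up to rounding, which is the uniqueness of the fixed point in this special case.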
Linear spaces obviously possess similarities with arbitrary factors $k$,
but this does not characterize them among all G-spaces without differentiability
hypotheses. For if in (3) we choose $g(t) = t^{\beta}$, $\beta > 0$, then
$p_g(\delta x, \delta y) = \delta^{\beta} p_g(x, y)$, where $\delta x = (\delta x_1, \delta x_2)$, so that for $\delta^{\beta} = k$ the
mapping $x \to \delta x$ is a similarity with the factor $k$; yet the space is linear only
for $\beta = 1$.
Differentiability always means a locally nearly linear behaviour.
We say that a G-space is continuously differentiable at $p$ if for any sequence
of triples of distinct points $a_\nu, b_\nu, c_\nu$ which tend to $p$, and for any points
$b_\nu', c_\nu'$ with $(a_\nu b_\nu' b_\nu)$, $(a_\nu c_\nu' c_\nu)$ and $a_\nu b_\nu' : a_\nu b_\nu = a_\nu c_\nu' : a_\nu c_\nu = t_\nu$, we have
$$\lim_{\nu \to \infty} \frac{b_\nu' c_\nu'}{t_\nu \, b_\nu c_\nu} = 1.$$
(Differentiability would correspond to the special case $a_\nu = p$ and proves
insufficient as our example shows.) A G-space is Minkowskian if it possesses
one proper similarity $\alpha$ and is continuously differentiable at the
fixed point of $\alpha$.
This form of differentiability is adequate also for the problem originally
posed: Let a G-space $R$ be continuously differentiable at $p$. Put $p_t = p$,
and for $q \neq p$, let $(p\, q_t\, q)$ and $pq_t : pq = t$. Then for $x, y \in S(p, \rho_p)$, the
limit
$$m_p(x, y) = \lim_{t \to 0+} \frac{x_t y_t}{t}$$
exists. Obviously $m_p(p, x) = px$. The metric $m_p(x, y)$ can be extended so
as to yield a space satisfying I, II, III, and IV in the strong form that a
point $z$ with $(xyz)$ exists for any $x \neq y$. In general V will not hold for this
"tangential metric" at $p$. If V does hold we say, following the terminology
of the calculus of variations, that the space is regular at $p$.
If the space is continuously differentiable and regular at $p$, then the
metric $m_p(x, y)$ is Minkowskian and $xy / m_p(x, y) \to 1$ for $x \neq y$ and
$x \to p$, $y \to p$.
If the space $R$ is a Finsler space with $ds = f(x, dx)$ as above and $R$ is of
class $C^m$, $m > 4$, and $f$ is of class $C^{m-1}$ for $dx \neq 0$, then $R$ will, in
$S(p, \rho_p) - p$, be at least of class $C^{m-2}$, and $f$ of class $C^{m-3}$ in affine
coordinates belonging to $m_p$ (normal coordinates). Thus we obtain a complete
decision of the problem whether a G-space is a Finsler space of class $C^{\infty}$
and a partial solution for finite $m$.
Two dimensional Riemann spaces, [1]. The great variety of metrics
satisfying the Axioms I to V indicates that relaxing these axioms es-
sentially without adding others leads to spaces with too little structure for
a significant theory. On the other hand there are surfaces in $E^3$ like polyhedra
and general convex surfaces which do not satisfy our axioms,
hence still less the assumptions of classical differential geometry, but are
geometrically most interesting.
It is the purpose of A. D. Alexandrov's theory to define and study
(intrinsically and extrinsically) a class of surfaces narrow enough to
encompass deep results and yet wide enough to include, for example, the
mentioned surfaces. He assumes that the space R be a two-dimensional
manifold, metrized such that the distance of any two points x, y equals
the greatest lower bound of the lengths of all curves from x to y. (If II
holds, this implies III.) The problem is how to introduce the Riemannian
character of the metric without differentiability. Alexandrov's principal tool
is the (upper) angle $\alpha(T, T')$ between two segments $T, T'$ with the same
origin $z$: If $x(t), y(t)$, $t \geq 0$, $x(0) = y(0) = z$, represent $T$ and $T'$, then
$$\alpha(T, T') = \limsup_{s \to 0+,\ t \to 0+} \arccos \frac{s^2 + t^2 - (x(s)\,y(t))^2}{2st},$$
where $0 \leq \arccos \leq \pi$. For a geodesic triangle $D$ with sides $T, T', T''$
we define the excess as
$$e(D) = \alpha(T, T') + \alpha(T', T'') + \alpha(T'', T) - \pi.$$
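As a worked illustration of the excess, consider a geodesic triangle on the unit sphere. This sketch is not from the original text. It assumes that on a smooth surface the upper angle agrees with the ordinary angle between the tangent directions of the two sides; the helper names `angle_at` and `excess` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def angle_at(a, b, c):
    """Angle at vertex a between the great-circle arcs a->b and a->c."""
    # Tangent directions at a: project b and c onto the tangent plane at a.
    tb = normalize(tuple(bi - dot(a, b) * ai for ai, bi in zip(a, b)))
    tc = normalize(tuple(ci - dot(a, c) * ai for ai, ci in zip(a, c)))
    return math.acos(max(-1.0, min(1.0, dot(tb, tc))))

def excess(a, b, c):
    """Sum of the three angles of the geodesic triangle minus pi."""
    return angle_at(a, b, c) + angle_at(b, c, a) + angle_at(c, a, b) - math.pi

# The octant of the unit sphere: three right angles.
octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
e_octant = excess(*octant)
print(e_octant)
```

For this triangle the excess equals pi/2, which is also its area on the unit sphere; in the Euclidean plane the same sum of angles gives excess zero.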
The Riemannian character of the metric enters through the requirement
that for every compact subset $M$ of $R$ a number $\beta(M)$ exists such
that for any finite set of non-overlapping triangles $D_1, \ldots, D_m$ in $M$
$$\text{(5)} \qquad \sum_{i=1}^{m} |e(D_i)| < \beta(M).$$
According to Zalgaller [17] it suffices to require $\sum e(D_i) < \beta(M)$, in
other words, the triangles with negative excess never cause any trouble.
That the condition (5) is really essentially Riemannian follows from the
fact that a Minkowski plane does not satisfy it unless it is euclidean.
To these so-called surfaces with bounded curvature Alexandrov extended
some of the deepest theorems of differential geometry; we mention only
Weyl's problem: Let the sphere or the plane be metrized such that I, II, III
and (5) hold. Assume moreover that the excess $e(D) > 0$ for all small geodesic
triangles. Then the metric can be realized by a closed or by a complete
open convex surface in $E^3$. Actually the open surfaces were not covered by
the classical methods, and the results of Alexandrov on the deformation of
convex surfaces surpass by far anything obtainable by the traditional
approach.
The theorem of Nash-Kuiper on the $C^1$-embeddability in $E^3$ of given
abstract two-dimensional Riemannian manifolds in the classical sense
stresses the significance of these results, because it shows that a reasonable
and general class of surfaces in E3 cannot be defined in terms of differ-
entiability conditions only.
Bibliography
[1] ALEXANDROW, A. D., Die innere Geometrie der konvexen Flächen. Berlin 1955,
XVII + 522 pp.
[2] BOCHNER, S., Vector fields and Ricci curvature. Bulletin of the American
Mathematical Society, vol. 52 (1946), pp. 776-797.
[3] BUSEMANN, H., The geometry of geodesics. New York 1955, X + 422 pp.
[4] BUSEMANN, H., Metrizations of projective spaces. Proceedings of the American
Mathematical Society, vol. 8 (1957), pp. 387-390.
[5] BUSEMANN, H., Similarities and differentiability. Tôhoku Mathematical Journal,
Sec. Ser., vol. 9 (1957), pp. 56-67.
[6] BUSEMANN, H., Convex Surfaces. New York 1958, VII + 194 pp.
[7] BUSEMANN, H., Spaces with finite groups of motions. Journal de Mathématiques
pures et appliquées, 9th Ser., vol. 37 (1958), pp. 365-373.
[8] HADAMARD, J., Les surfaces à courbures opposées et leurs lignes géodésiques.
Journal de Mathématiques pures et appliquées, 5th Ser., vol. 4 (1898), pp.
27-73.
[9] HILBERT, D., Grundlagen der Geometrie. 8th ed., Stuttgart 1956, VII + 251
pp.
[10] HOPF, E., Closed surfaces without conjugate points. Proceedings of the National
Academy of Sciences, vol. 34 (1948), pp. 47-51.
[11] KELLY, P. J., and STRAUS, E. G., Curvature in Hilbert geometry. Pacific Journal
of Mathematics, vol. 8 (1958), pp. 119-126.
[12] PEDERSEN, F. P., On spaces with negative curvature. Matematisk Tidsskrift B
1952, pp. 66-89.
[13] SKORNYAKOV, L. A., Metrization of the projective plane in connection with a
given system of curves (Russian). Izvestiya Akademii Nauk SSSR, Seriya
Matematičeskaya 19 (1955), pp. 471-482.
[14] TITS, J., Sur certaines classes d'espaces homogènes de groupes de Lie. Académie
royale de Belgique, Classe des sciences, Mémoires, Collection in-8°, vol. 39
fasc. 3 (1955), 268 pp.
[15] WALD, A., Begründung einer koordinatenlosen Differentialgeometrie der Flächen.
Ergebnisse eines mathematischen Kolloquiums (Wien) Heft 7 (1936),
pp. 24-46.
[16] WANG, H. C., Two-point homogeneous spaces. Annals of Mathematics, vol. 55
(1952), pp. 177-191.
[17] ZALGALLER, V. A., On the foundations of the theory of two-dimensional manifolds
with bounded curvature (Russian). Doklady Akademii Nauk SSSR, vol. 108
(1956), pp. 575-576.
Symposium on the Axiomatic Method
AXIOMS FOR INTUITIONISTIC PLANE AFFINE GEOMETRY
A. HEYTING
University of Amsterdam, Amsterdam, Netherlands
1. Introduction. At first sight it may appear that the axiomatic method
cannot be used in intuitionistic mathematics, because only mathematical
objects which have been constructed are considered there, so that it
makes no sense to derive consequences from hypotheses which are not yet
realized. Yet the inspection of the methods which are actually used in
intuitionistic mathematics, shows us that they are for an important part
axiomatic in nature, though the significance of the axiomatic method is
perhaps somewhat different from that which it has in classical mathe-
matics.
In principle every theorem can be expressed in the form of an axio-
matic theory. Instead of "Every natural number is a product of prime
numbers" we can write "Axiom. n is a natural number. Theorem. n is
a product of prime numbers.". This way of presentation becomes prac-
ticable whenever a great number of theorems contains the same compli-
cated set of hypotheses. Thus, conversely, every axiomatic theory can be
read as one general theorem of the form : "Whenever we have constructed
a mathematical object M satisfying the axioms A, we can affirm about M
the theorems T."
Of course the content of the theory will be influenced by the intui-
tionistic point of view; in particular, questions of effective constructibility
will be of main importance. In order to give an idea of these differences I
shall show the method at work in an example, which I have so chosen
that the problem is trivial in classical mathematics, so that the intui-
tionistic difficulties appear, so to say, in their purest form.
2. The problem. In [1] and [2] I gave a system of axioms for intuitionistic
plane projective geometry. Here I wish to give a system of axioms for
plane affine geometry which is satisfied by the intuitionistic analytical
geometry, and which allows us to construct, by a suitable extension of the
plane, a projective plane which satisfies the axioms of [1] and [2]. This
problem is easy in the case of desarguesian geometry, because then the
extension can be effected by means of harmonic conjugates. If only the
trivial axioms of incidence are assumed, the problem is still easy from the
classical point of view, but it presents serious difficulties in intuitionistic
mathematics. These difficulties are caused by the fact that not only
points at infinity must be adjoined to the affine plane, but also points for
which it is unknown whether they are at infinity or not.
3. The axiom system for projective geometry.
FUNDAMENTAL NOTIONS:
Two disjoint sets $\mathfrak{P}$ and $\mathfrak{L}$; the elements of $\mathfrak{P}$ are called points, those
of $\mathfrak{L}$ lines.
A relation $\#$, whose domain and range are $\mathfrak{P}$; this relation is called
apartness.
A relation $\in$, whose domain is $\mathfrak{P}$ and whose range is $\mathfrak{L}$; this relation is
called incidence.
Notation. Capitals in italics denote points; lower case italics denote
lines.
Free use is made of such expressions as "a line through a point"; the
translation into the incidence-language is left to the reader. Also, line $l$
is sometimes identified with the set of points incident with $l$, without
further explanation. It would be easy to avoid such identifications by a
somewhat clumsier presentation. In particular the notation $l \cap m$ is used
for the set of points which are incident with $l$ as well as with $m$.
Logical signs are used as abbreviations. They must be understood in
the intuitionistic sense (see [3] and [4] or [5]).
$\to$ stands for implication, $\&$ for conjunction, $\vee$ for disjunction, $\neg$ for
negation,
$(\forall x)$ is the universal quantifier (for every $x$),
$(\exists x)$ is the existential quantifier (there exists an $x$ such that).
AXIOMS FOR APARTNESS:
S1 $A \# B \to B \# A$.
S2 $\neg\, A \# B \leftrightarrow A = B$.
S3 $A \# B \to (\forall C)(C \# A \vee C \# B)$.
GEOMETRICAL AXIOMS :
P1 $A \# B \to (\exists l)(A \in l \;\&\; B \in l)$.
P2 $A \# B \;\&\; A \in l \cap m \;\&\; B \in l \cap m \to l = m$.
DEFINITION 1. $A$ lies OUTSIDE $l$ ($A \,\omega\, l$) if $(\forall B)(B \in l \to B \# A)$.
DEFINITION 2. $l$ lies APART FROM $m$ ($l \# m$), if $(\exists A)(A \in l \;\&\; A \,\omega\, m)$.
P3 $l \# m \to (\exists A)(A \in l \cap m)$.
P4 $A \# B \;\&\; A \in l \;\&\; B \in l \;\&\; C \,\omega\, l \;\&\; A \in m \;\&\; C \in m \to B \,\omega\, m$.
P5 (i) There exist two points, $A$ and $B$, so that $A \# B$;
(ii) Every line contains at least three points $A, B, C$, so that
$A \# B$, $A \# C$ and $B \# C$;
(iii) When $l$ is a line, a point outside $l$ can be found.
DEFINITION 3. If $A \# B$, then the line $l$ satisfying $A \in l \;\&\; B \in l$ is
denoted by $AB$.
It can be proved from these axioms, that the relation $\#$ between lines
(Definition 2) is an apartness relation; this means that it satisfies axioms
S1-S3 for lines instead of points.
4. The axiom system for affine geometry.
FUNDAMENTAL NOTIONS: $\mathfrak{P}$, $\mathfrak{L}$, $\#$, $\in$, as in § 3.
AXIOMS FOR APARTNESS: S1-S3, as in § 3.
DEFINITIONS: 1 and 2 as in § 3.
GEOMETRICAL AXIOMS :
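A1 $l \# m,\ A \,\omega\, l \to (\exists p)(A \in p \;\&\; l \cap p = l \cap m)$.
A2 $A \# B \;\&\; A \in l \cap m \;\&\; B \in l \cap m \to l = m$.
DEFINITION 4. $l$ INTERSECTS $m$ if $l \# m \;\&\; (\exists A)(A \in l \cap m)$.
A3 $l \text{ intersects } m \to (\forall p)((\exists A)(A \in l \cap p) \vee (\exists B)(B \in m \cap p))$.
A4 $A \# B \;\&\; A \in l \;\&\; B \in l \;\&\; C \,\omega\, l \;\&\; A \in m \;\&\; C \in m \to B \,\omega\, m$.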
A5 $P \in l \;\&\; l \cap m = \emptyset \;\&\; m \# l \to P \,\omega\, m$.
DEFINITION 5. $l$ is PARALLEL to $m$ ($l \parallel m$) if $(\forall A)(A \in l \to A \,\omega\, m)$.
Remark: A5 can now be formulated as follows: $l \cap m = \emptyset \;\&\; m \# l \to l \parallel m$.
A6 $(\forall l)(\exists m)(l \parallel m)$.
A7 (i) There exists at least one line;
(ii) Every line is incident with at least four points every two of
which are apart from each other;
(iii) $A \# B \to (\exists l)(A \in l \;\&\; B \,\omega\, l)$.
(iv) $A \in l \to (\exists m)(A \in m \;\&\; l \# m)$.
Remarks: (1) If $l$ and $m$ have a common point $B$, A1 asserts the existence
of a line joining $A$ and $B$ (see Th. 1a). On the other hand, if $l$ and $m$
have no common point, it follows from A1 that there is a line through $A$
which does not intersect $l$; this is part of the assertion of existence of
parallels. Moreover, A1 admits an assertion in the case that it is unknown
whether $l$ intersects $m$. (2) A2 = P2. (3) A3 is a strong form of the uniqueness
assertion for parallels. (4) A4 is called the triangle axiom.
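The definitions and axioms of this section can be exercised on a small decidable model. The sketch below is an illustration only and not part of the original text. It assumes the affine plane over the integers modulo 5 with apartness read as plain inequality, so the specifically intuitionistic content disappears. A3 is read as "every line has a point in common with at least one of two intersecting lines", A4 as the triangle axiom in the same form as P4, and A5 in the form given in the remark after Definition 5. All function names are hypothetical.

```python
from itertools import product

Q = 5  # assumption: coordinates in Z/5Z, so every line has 5 >= 4 points

POINTS = list(product(range(Q), repeat=2))
LINES = [frozenset((x, (s * x + b) % Q) for x in range(Q))
         for s in range(Q) for b in range(Q)]
LINES += [frozenset((c, y) for y in range(Q)) for c in range(Q)]

def apart_pts(A, B):
    """Apartness of points; in this decidable model it is just inequality."""
    return A != B

def outside(A, l):
    """Definition 1: A lies outside l if every point of l is apart from A."""
    return all(apart_pts(B, A) for B in l)

def apart(l, m):
    """Definition 2: l # m if some point of l lies outside m."""
    return any(outside(A, m) for A in l)

def intersects(l, m):
    """Definition 4: l # m and the two lines have a common point."""
    return apart(l, m) and bool(l & m)

def parallel(l, m):
    """Definition 5: every point of l lies outside m."""
    return all(outside(A, m) for A in l)

def check_axioms():
    a1 = all(any(A in p and l & p == l & m for p in LINES)
             for l in LINES for m in LINES if apart(l, m)
             for A in POINTS if outside(A, l))
    a2 = all(l == m
             for l in LINES for m in LINES
             for A in l & m for B in l & m if apart_pts(A, B))
    a3 = all(bool(l & p) or bool(m & p)
             for l in LINES for m in LINES if intersects(l, m)
             for p in LINES)
    a4 = all(outside(B, m)
             for l in LINES for m in LINES
             for A in l & m
             for B in l if apart_pts(A, B)
             for C in m if outside(C, l))
    a5 = all(parallel(l, m)
             for l in LINES for m in LINES
             if not (l & m) and apart(m, l))
    a6 = all(any(parallel(l, m) for m in LINES) for l in LINES)
    a7 = all(len(l) >= 4 for l in LINES)
    return {"A1": a1, "A2": a2, "A3": a3, "A4": a4,
            "A5": a5, "A6": a6, "A7(ii)": a7}

RESULTS = check_axioms()
print(RESULTS)
```

A finite model of this kind only shows that the axioms are jointly satisfiable; it says nothing about the cases in which it is unknown whether two lines intersect, which are the point of the paper.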
5. Elementary theorems.
THEOREM 1. If $l \# m$, $A \in l \cap m$, $B \in l \cap m$, then $A = B$.
PROOF: Suppose $A \# B$; then, by A2, $l = m$, which contradicts
$l \# m$. So $A = B$.
Remark: In the case of Th. 1 we write $l \cap m = A$.
THEOREM 1a. $A \# B \to (\exists l)(A \in l \;\&\; B \in l)$.
PROOF: By A7(iii) there is a line $p$ so that $A \in p \;\&\; B \,\omega\, p$.
By A7(iv) there is a line $m$ so that $A \in m \;\&\; p \# m$.
By Th. 1, $p \cap m = A$. By A1 there is a line $l$ so that $B \in l$ and
$p \cap l = p \cap m = A$; it follows that $A \in l$.
THEOREM 2. The relation $\#$ between lines is an apartness relation, i.e.
it possesses the properties (i)-(iii):
(i) $l \# m \to m \# l$.
(ii) $\neg\, l \# m \leftrightarrow l = m$.
(iii) $l \# m \to (\forall p)(p \# l \vee p \# m)$.
LEMMA 2.1. (iii) holds if $l$ intersects $m$ in $S$ and $S \in p$.
PROOF: Choose points $A, B, C$ so that $A \in l$, $A \,\omega\, m$ (Def. 2); $B \in m$,
$B \# S$; $C \in m$, $C \# S$, $C \# B$. This is possible by A7(ii), S3. By A4,
$B \,\omega\, AC$, so $AB \# AC$. By A3, $p$ has a point in common with $AB$ or with
$AC$; say $D \in p \cap AB$. $D \# A \vee D \# B$. If $D \# A$, then $D \,\omega\, l$ (A4), so
$p \# l$; if $D \# B$, then $D \,\omega\, m$ (A4), so $p \# m$.
PROOF OF (i): Choose $A, B, C$ so that $A \in l$, $A \,\omega\, m$, $B \in m$, $C \in m$, $B \# C$.
$AB \# AC$ (A4). $l \# AB \vee l \# AC$ (Lemma 2.1). If $l \# AB$, we choose
$D$ on $l$ so that $D \,\omega\, AB$; then $B \,\omega\, l$ (A4), so $m \# l$. Similarly, if $l \# AC$.
LEMMA 2.2. $\neg\, P \,\omega\, l \to P \in l$.
PROOF: Choose $C$ and $m$ so that $C \,\omega\, l$, $C \in m$, $P \,\omega\, m$; then $l \# m$.
Choose $A$ so that $A \in m$, $A \,\omega\, l$ (Th. 2(i)). $PA \# m$. By A3, $l$ has a point in
common with $PA$ or with $m$. If $l$ intersects $m$ in $S$, we choose $B$ so that
$B \in m$, $B \,\omega\, PA$. $PA \# PB$; $l$ has a point in common with $PA$ or with $PB$.
Say $Q \in l \cap PA$. Now suppose $P \# Q$; then $P \,\omega\, l$, which contradicts the
hypothesis, so $\neg\, P \# Q$, so $P = Q$ (S2), so $P \in l$.
PROOF OF (ii): $\neg\, l \# m \to l = m$ is an immediate consequence of
Lemma 2.2, while $l = m \to \neg\, l \# m$ is a consequence of S2.
PROOF OF (iii): Choose $P, Q, R, S$ so that $P \in l$, $P \,\omega\, m$; $Q, R, S \in m$;
$Q \# R \# S \# Q$. As in the proof of (i) it can be shown that at least two
of the points $Q, R, S$ are outside $l$; say $Q, R \,\omega\, l$. $PQ \# PR$, so $p$ has a
point in common with $PQ$ or with $PR$; say $A \in p \cap PQ$. $A \# P \vee A \# Q$.
If $A \# P$, then $A \,\omega\, l$, so $p \# l$. If $A \# Q$, then $A \,\omega\, m$, so $p \# m$.
THEOREM 3. $P \,\omega\, l \to (\exists m)(P \in m \;\&\; m \parallel l)$.
PROOF: Draw a line $p \parallel l$ (A6). By A1, there is a line $m$ through $P$ so
that $l \cap m = l \cap p = \emptyset$. By A5, $m \parallel l$.
THEOREM 4. $l \parallel m \;\&\; l \parallel n \;\&\; m \# n \to m \parallel n$.
PROOF: Choose $P$ and $Q$ so that $P \in n$, $P \,\omega\, m$, $Q \in m$. $PQ \# m$. By A3,
$l$ has a point $S$ in common with $PQ$. $S \,\omega\, n$; by A4, $Q \,\omega\, n$. As $Q$ is an
arbitrary point on $m$, we have $m \parallel n$.
6. Projective points.
DEFINITION 6. $l \# m \to \mathfrak{P}(l, m) = \{x \mid l \cap m = l \cap x \vee l \cap m = m \cap x\}$.
If $l \# m$, then $\mathfrak{P}(l, m)$ is a PROJECTIVE POINT (abbreviation: p. point).
Remarks: Where $\mathfrak{P}(l, m)$ occurs, it is understood that $l \# m$. German
capitals will be used to denote projective points. The next theorem shows
that the notion of a projective point is an extension of that of a line pencil
in the usual sense.
THEOREM 5. If $l$ intersects $m$ in $S$, then $\mathfrak{P}(l, m)$ is the class of all lines
through $S$. If $l \parallel m$, then $\mathfrak{P}(l, m) = \{x \mid x \parallel l \vee x \parallel m\}$.
PROOF: The first part of the theorem is obvious. If $l \parallel m$, and
$n \in \mathfrak{P}(l, m)$, then $l \cap n = \emptyset$ or $m \cap n = \emptyset$; also $l \# n$ or $m \# n$. The only
case which needs to be further considered is $l \cap n = \emptyset \;\&\; m \# n$. Suppose
$S \in m \cap n$; then by A3, $l$ has a point in common with $m$, in contradiction
with $l \parallel m$. Thus $m \cap n = \emptyset$, so $m \parallel n$. Conversely, if $l \parallel m$ and $n \parallel l$,
then $l \cap m = l \cap n$, so $n \in \mathfrak{P}(l, m)$.
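Definition 6 and Theorem 5 can be made concrete in a finite, decidable model. The sketch below is an illustration only and not part of the original text. It assumes the affine plane over the integers modulo 5, where two lines are apart exactly when they are different and parallel exactly when they are disjoint. The function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

Q = 5  # assumption: the affine plane over Z/5Z
LINES = [frozenset((x, (s * x + b) % Q) for x in range(Q))
         for s in range(Q) for b in range(Q)]
LINES += [frozenset((c, y) for y in range(Q)) for c in range(Q)]

@lru_cache(maxsize=None)
def proj_point(l, m):
    """Definition 6: all lines x whose meet with l, or with m, equals l & m.

    Only meaningful for l # m, which here means l != m.
    """
    common = l & m
    return frozenset(x for x in LINES if common == l & x or common == m & x)

def theorem5_holds():
    """Intersecting lines give a pencil through S; parallel lines a parallel class."""
    for l, m in product(LINES, repeat=2):
        if l == m:
            continue
        if l & m:
            (S,) = l & m  # two distinct lines meet in at most one point
            expected = frozenset(x for x in LINES if S in x)
        else:
            expected = frozenset(x for x in LINES
                                 if x.isdisjoint(l) or x.isdisjoint(m))
        if proj_point(l, m) != expected:
            return False
    return True

def theorem6_holds():
    """Any two distinct lines of a p. point determine the same p. point."""
    return all(proj_point(l, m) == proj_point(p, q)
               for l in LINES for m in LINES if l != m
               for p in proj_point(l, m) for q in proj_point(l, m) if p != q)

SIZES = sorted({len(proj_point(l, m))
                for l in LINES for m in LINES if l != m})
print(theorem5_holds(), theorem6_holds(), SIZES)
```

In this model every projective point is either a pencil of 6 lines through a point (proper) or a class of 5 mutually parallel lines (improper). The cases that are neither proper nor improper, which motivate the definition, cannot occur in a decidable model.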
THEOREM 6. $p, q \in \mathfrak{P}(l, m) \;\&\; p \# q \to \mathfrak{P}(l, m) = \mathfrak{P}(p, q)$.
LEMMA 6.1. $l \# m \;\&\; l \# n \;\&\; l \cap m = m \cap n \to l \cap m = l \cap n$.
PROOF: It follows from the hypothesis that $l \cap m \subset l \cap n$. Suppose
$P \in l \cap n$; then $l$ intersects $n$, so $m$ has a point in common with $l$ or with
$n$ (A3).
Case 1. $m$ intersects $l$. As $l \cap n = P$ and $l \cap m \subset l \cap n$, we have
$l \cap m = P = l \cap n$.
Case 2. $Q \in m \cap n$. Then $Q \in l \cap m$, so $Q \in l \cap n$; it follows that
$Q = P$ and that $l \cap n \subset l \cap m$.
COROLLARY: In the case that $l \# n$ we have $n \in \mathfrak{P}(l, m) \leftrightarrow l \cap m = l \cap n$.
LEMMA 6.2. $l \# m \# n \# l \;\&\; n \in \mathfrak{P}(l, m) \to \mathfrak{P}(l, m) = \mathfrak{P}(l, n)$.
PROOF: By hypothesis and Lemma 6.1 we have $l \cap m = l \cap n = m \cap n$.
Suppose $p \in \mathfrak{P}(l, m)$. $p \# l$ or $p \# m \;\&\; p \# n$.
Case 1. $p \# l$. Then $l \cap m = l \cap p$ (Lemma 6.1), so $l \cap n = l \cap p$, so
$p \in \mathfrak{P}(l, n)$.
Case 2. $p \# m \;\&\; p \# n$. Now $l \cap m = m \cap p$. $X \in l \cap n \to X \in l \cap m
\to X \in m \cap p$, so $X \in n \cap p$. It follows that $l \cap n \subset n \cap p$. Now suppose
$Y \in n \cap p$. By A3, we may distinguish subcases 2a ($m$ intersects $n$) and
2b ($m$ intersects $p$).
Case 2a. $m$ intersects $n$ in $Z$. $m \cap n = l \cap m = m \cap p$, so $Z \in n \cap p$.
It follows that $Z = Y$, and that $Y \in m \cap n$, so $Y \in l \cap n$.
Case 2b. $m$ intersects $p$ in $Z$. $Z \in m \cap p = l \cap m = l \cap n$, so $Z \in n \cap p$.
It follows that $Y = Z$ and that $Y \in m \cap p = l \cap n$.
In case 2a as well as in case 2b we have proved that $n \cap p \subset l \cap n$; thus
in case 2, $l \cap n = n \cap p$, that is $p \in \mathfrak{P}(l, n)$. We have proved that
$\mathfrak{P}(l, m) \subset \mathfrak{P}(l, n)$. In particular, $m \in \mathfrak{P}(l, n)$; the same proof then gives us
$\mathfrak{P}(l, n) \subset \mathfrak{P}(l, m)$.
LEMMA 6.3. $l \# m \;\&\; l \# n \;\&\; n \in \mathfrak{P}(l, m) \to \mathfrak{P}(l, m) = \mathfrak{P}(l, n)$.
PROOF: Choose $A$ and $B$ so that $A \in l$, $A \,\omega\, m$, $B \in m$, $B \,\omega\, l$.
By A3, $n$ intersects $l \vee (\exists C)(C \in n \cap AB)$.
If $n$ intersects $l$ in $P$, $l \cap m = l \cap n = P$, so $\mathfrak{P}(l, m) = \mathfrak{P}(l, n)$.
If $C \in n \cap AB$, we have $n \# m \vee n \# AB$.
If $n \# m$, we can apply Lemma 6.2.
If $n \# AB$, we choose $D$ so that $D \# A, B, C$ and $D \in AB$. By A1, there
is a line $p$ so that $D \in p \;\&\; p \in \mathfrak{P}(l, m)$. Now by Lemma 6.2,
$\mathfrak{P}(l, m) = \mathfrak{P}(l, p) = \mathfrak{P}(l, n)$.
PROOF OF THEOREM 6: We may suppose that $q \# l$.
Choose $n$ in $\mathfrak{P}(l, m)$ so that $n \# l, q$ (see the proof of Lemma 6.3). By
Lemma 6.3 we have $\mathfrak{P}(l, m) = \mathfrak{P}(l, n) = \mathfrak{P}(n, q) = \mathfrak{P}(p, q)$.
DEFINITION 7. If $l$ intersects $m$ in $S$, then $\mathfrak{P}(l, m)$ is a PROPER p. point
and we write $\mathfrak{P}(l, m) = S$. If $l \parallel m$, then $\mathfrak{P}(l, m)$ is an IMPROPER p. point.
Remark: It is by no means true that every p. point is either proper or
improper. However, as an immediate consequence of A5, a p. point that
cannot be proper is improper.
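DEFINITION 8. $l$ lies OUTSIDE $\mathfrak{A}$ ($l \,\omega\, \mathfrak{A}$) if $(\forall p)(p \in \mathfrak{A} \to p \# l)$.
DEFINITION 9. $\mathfrak{A}$ lies APART from $\mathfrak{B}$ ($\mathfrak{A} \# \mathfrak{B}$) if $(\exists p)(p \in \mathfrak{A} \;\&\; p \,\omega\, \mathfrak{B})$.
Remark. If $\mathfrak{A}$ is a proper p. point $A$, then $l \,\omega\, \mathfrak{A}$ is equivalent with $A \,\omega\, l$.
This is easily proved by means of Axiom A4. It follows that, for proper p.
points $\mathfrak{A} = A$ and $\mathfrak{B} = B$, $\mathfrak{A} \# \mathfrak{B}$ is equivalent to $A \# B$. Axiom A1
can now be read as follows:
If $A$ is a proper p. point and $\mathfrak{B}$ a p. point so that $A \# \mathfrak{B}$, then there exists
a line $l$ so that $A \in l$ and $l \in \mathfrak{B}$. This line will be denoted by $A\mathfrak{B}$. It follows
from Th. 8 below that it is unique.
THEOREM 7. The relation $\#$ between projective points is an apartness
relation; that is, it possesses the properties (i), (ii), (iii):
(i) $\mathfrak{A} \# \mathfrak{B} \to \mathfrak{B} \# \mathfrak{A}$.
(ii) $\neg\, \mathfrak{A} \# \mathfrak{B} \leftrightarrow \mathfrak{A} = \mathfrak{B}$.
(iii) $\mathfrak{A} \# \mathfrak{B} \to (\forall \mathfrak{C})(\mathfrak{A} \# \mathfrak{C} \vee \mathfrak{B} \# \mathfrak{C})$.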
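LEMMA 7.1. $m$ intersects $l \;\&\; D \in m \cap p \;\&\; D \,\omega\, l \to p$ intersects $l \vee p \# m$.
PROOF: There is a line $m'$ through $D$ so that $m' \parallel l$ (Th. 3); $m \# m'$.
$p \# m$ or $p \# m'$; if $p \# m'$, then $l$ intersects $p$ (A3).
LEMMA 7.2: $C \in l \;\&\; C \,\omega\, m \;\&\; l \# n \;\&\; l \cap m = l \cap n \to C \,\omega\, n$.
PROOF: Choose $A$ so that $A \in n \;\&\; A \,\omega\, l$. $AC = p$. The line $n$ intersects $l$
$\vee\ n \# p$ (Lemma 7.1). If $n$ intersects $l$, then $m$ intersects $l$, because
$l \cap m = l \cap n$, so $C \,\omega\, n$ (A4). If $n \# p$, then $C \,\omega\, n$ (A4).
PROOF OF THEOREM 7(i): Choose $l$ and $m$ so that $\mathfrak{A} = \mathfrak{P}(l, m)$ and that
$l \,\omega\, \mathfrak{B}$; choose $C$ on $l$ so that $C \,\omega\, m$. There is a line $p$ so that $C \in p$, $p \in \mathfrak{B}$
(A1); $p \# l$. Let $n$ be a line in $\mathfrak{A}$; $n \# l \vee n \# p$.
If $n \# l$, then $C \,\omega\, n$ (Lemma 7.2), so $n \# p$.
We have proved that $p \,\omega\, \mathfrak{A}$, so $\mathfrak{B} \# \mathfrak{A}$.
PROOF OF THEOREM 7(ii): $\mathfrak{A} = \mathfrak{P}(l, m)$. Let $p$ be a line in $\mathfrak{B}$, then
$\neg\, p \,\omega\, \mathfrak{A}$. $p \# l \vee p \# m$; suppose $p \# l$. I shall prove that $l \cap m = l \cap p$.
Suppose $X \in l \cap m$, so that $\mathfrak{A}$ consists of all the lines through $X$. If we
had $X \,\omega\, p$, then $p \,\omega\, \mathfrak{A}$, which gives a contradiction, so $X \in p$. Thus
$l \cap m \subset l \cap p$.
Suppose $Y \in l \cap p$. I derive a contradiction from $Y \,\omega\, m$, as follows:
Choose $n$ in $\mathfrak{A}$; $n \# l \vee n \# p$.
If $n \# l$, then $Y \,\omega\, n$ (Lemma 7.2), so $n \# p$.
Now $n \# p$ for every $n$ in $\mathfrak{A}$, so $p \,\omega\, \mathfrak{A}$. This is the desired contradiction.
Thus $Y \in m$, and $l \cap p \subset l \cap m$.
PROOF OF THEOREM 7(iii): Choose $l$ in $\mathfrak{A}$ so that $l \,\omega\, \mathfrak{B}$; further $m, r, s$
so that $\mathfrak{A} = \mathfrak{P}(l, m)$, $\mathfrak{C} = \mathfrak{P}(r, s)$. $l \# r \vee l \# s$; say $l \# r$. As in the proof
of part (i), we find a line $p$ in $\mathfrak{B}$, so that $p \,\omega\, \mathfrak{A}$ and that $p$ intersects $l$ in
$D$; $D$ can so be chosen that $D \,\omega\, r$ [If $D, E \in l$ and $D \# E$, then $D \,\omega\, m$
or $E \,\omega\, m$; this is shown in the proof of Lemma 2.1. By choosing three
points on $l$ we find at least one point outside $m$ and outside $r$]. Draw the
line $t$ so that $D \in t$, $t \in \mathfrak{C}$ (A1). $l \# t \vee p \# t$. Suppose $l \# t$. For an
arbitrary line $u$ in $\mathfrak{C}$ we have $u \# l \vee u \# t$. If $u \# t$, then $D \,\omega\, u$ (Lemma
7.2), so $u \# l$. It follows that $l \,\omega\, \mathfrak{C}$, so $\mathfrak{A} \# \mathfrak{C}$. Similarly, if $p \# t$, we have
$\mathfrak{B} \# \mathfrak{C}$.
THEOREM 8. $\mathfrak{A} \# \mathfrak{B} \;\&\; l \in \mathfrak{A} \cap \mathfrak{B} \;\&\; m \in \mathfrak{A} \cap \mathfrak{B} \to l = m$.
PROOF: Suppose $l \# m$; then $\mathfrak{A} = l \cap m$ and $\mathfrak{B} = l \cap m$. Apply Th.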
7. Projective lines.
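DEFINITION 9. $\mathfrak{A} \# \mathfrak{B} \to \lambda(\mathfrak{A}, \mathfrak{B}) = \{\mathfrak{P} \mid \mathfrak{A} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B} \vee \mathfrak{B} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B}\}$.
If $\mathfrak{A} \# \mathfrak{B}$, then $\lambda(\mathfrak{A}, \mathfrak{B})$ is a PROJECTIVE line (p. line); where $\lambda(\mathfrak{A}, \mathfrak{B})$
occurs, it is understood that $\mathfrak{A} \# \mathfrak{B}$.
Greek lower case letters will be used to denote projective lines.
THEOREM 9. $\mathfrak{A} \# \mathfrak{B} \;\&\; l \in \mathfrak{A} \cap \mathfrak{B} \to \lambda(\mathfrak{A}, \mathfrak{B}) = \{\mathfrak{C} \mid l \in \mathfrak{C}\}$.
PROOF: I. Let $\mathfrak{C}$ be a p. point such that $l \in \mathfrak{C}$. $\mathfrak{C} \# \mathfrak{A} \vee \mathfrak{C} \# \mathfrak{B}$; suppose
$\mathfrak{C} \# \mathfrak{A}$. $l \in \mathfrak{A} \cap \mathfrak{B}$ and $l \in \mathfrak{A} \cap \mathfrak{C}$, so $\mathfrak{A} \cap \mathfrak{B} = \mathfrak{A} \cap \mathfrak{C}$ (Th. 8), so
$\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$.
II. Let $\mathfrak{P}$ be a p. point in $\lambda(\mathfrak{A}, \mathfrak{B})$. $\mathfrak{A} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B}$ or $\mathfrak{B} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B}$;
in either case $l \in \mathfrak{P}$.
If, as in Th. 9, $\lambda(\mathfrak{A}, \mathfrak{B})$ contains a line $l$, then it is called a PROPER
projective line; we write in this case $\lambda(\mathfrak{A}, \mathfrak{B}) = l$.
THEOREM 10. If $\mathfrak{A} \# \mathfrak{B}$ and $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line $l$, then either
$\mathfrak{A}$ or $\mathfrak{B}$ is a proper p. point.
PROOF: Choose $m$ so that $m \in \mathfrak{A}$, $m \,\omega\, \mathfrak{B}$, and $C$ so that $C \in m$, $C \,\omega\, l$.
$C\mathfrak{B} = n$. By A3, $l$ intersects $m$ or $n$, so $\mathfrak{A}$ is proper or $\mathfrak{B}$ is proper.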
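DEFINITION 10. A p. point $\mathfrak{A}$ lies OUTSIDE the p. line $\lambda$ ($\mathfrak{A} \,\omega\, \lambda$), if $\mathfrak{A}$ is
apart from every p. point in $\lambda$.
THEOREM 11: $\mathfrak{A} \,\omega\, l$ is equivalent with $l \,\omega\, \mathfrak{A}$.
LEMMA 11.1: If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{A}$ is proper $= A$, $A\mathfrak{B} = l$, $n \in \mathfrak{B}$, $n \# l$, then
$A \,\omega\, n$.
PROOF: Choose $P$ so that $P \in n$, $P \,\omega\, l$. $AP = p$.
By Lemma 7.1, $n$ intersects $l$ or $n \# p$.
If $n$ intersects $l$, then $\mathfrak{B}$ is proper, so $A \,\omega\, n$ by Axiom A4.
If $n \# p$, then $A \,\omega\, n$, also by Axiom A4.
LEMMA 11.2: If $B \in l$ and $\mathfrak{A} \,\omega\, l$, then $B\mathfrak{A} \# l$.
PROOF: $B\mathfrak{A} = p$; choose $q$ in $\mathfrak{A}$ so that $q \# p$.
By Lemma 11.1, $B \,\omega\, q$, so $l \# q$.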
Put $\mathfrak{P}(l, q) = \mathfrak{C}$. By Th. 10, either $\mathfrak{A}$ is proper ($\mathfrak{A} = A$) or $\mathfrak{C}$ is proper
($\mathfrak{C} = C$).
If $\mathfrak{A} = A$, then $A \,\omega\, l$, so $AB \# l$.
If $\mathfrak{C} = C$, then $C \,\omega\, p$ (Lemma 11.1), so $l \# p$.
PROOF OF THEOREM 11: I. Suppose $\mathfrak{A} \,\omega\, l$. Choose $B$ on $l$; $B\mathfrak{A} = p$.
By Lemma 11.2, $p \# l$. Choose $r$ in $\mathfrak{A}$; $r \# l$ or $r \# p$.
If $r \# p$, then $B \,\omega\, r$ (Lemma 11.1), so $l \# r$.
We have proved that for every line $r$ in $\mathfrak{A}$, $l \# r$ is valid, so $l \,\omega\, \mathfrak{A}$.
II. Suppose $l \,\omega\, \mathfrak{A}$, and let $\mathfrak{B}$ be any p. point of $l$; then $l \in \mathfrak{B}$, so
$\mathfrak{A} \# \mathfrak{B}$. Thus $\mathfrak{A} \,\omega\, l$.
THEOREM 12: If $l$ is not apart from $\mathfrak{A}$, then $l \in \mathfrak{A}$.
PROOF: This has been shown in the proof of Th. 7(ii).
DEFINITION 11. Two p. lines $\lambda$ and $\mu$ are APART from each other ($\lambda \# \mu$)
if there exists a p. point $\mathfrak{A}$ so that $\mathfrak{A} \in \lambda$ and $\mathfrak{A} \,\omega\, \mu$.
We shall not prove directly that the relation $\#$ between p. lines is an
apartness relation; this follows from the main result of the paper, as it
has been derived in [1, 2] from the axioms S1-S3, P1-P5.
8. Proof of the projective axioms. Our problem is to prove that pro-
jective points and projective lines satisfy the axioms P1-P5 of projective
geometry. I have not succeeded in proving this from A1-A7; I must
introduce some further axioms, which I shall mention where I want them.
P1. If the p. points $\mathfrak{A}$ and $\mathfrak{B}$ are apart from each other, then there is
a p. line $\lambda$ which contains $\mathfrak{A}$ and $\mathfrak{B}$.
This is an immediate consequence of Def. 9.
P2. If the p. points $\mathfrak{A}$ and $\mathfrak{B}$ are apart from each other, and if both
are contained in the p. lines $\lambda$ and $\mu$, then $\lambda = \mu$.
This will be proved in Theorem 23.
P3. If the p. lines $\lambda$ and $\mu$ are apart from each other, then they have a
p. point in common. This property, for the case that $\lambda$ or $\mu$ is proper, is
affirmed in a new axiom A8.
A8. $\mathfrak{A} \# \mathfrak{B} \;\&\; l \,\omega\, \mathfrak{A} \to (\exists \mathfrak{C})(l \in \mathfrak{C} \;\&\; \mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B}))$.
The quantifier over projective points can be avoided. An equivalent
formulation is:
A'8: p#q&r#s&lco <$(p, q) & r co %(p, q)
# / & - %(p, g) o %(r, s) =
In the case that $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line, A8 follows from A1-A7 and
Def. 7. A8 suffices to prove P3, because it follows from Th. 17 below,
that if the p. lines $\lambda$ and $\mu$ are apart from each other, then $\lambda$ or $\mu$ is a
proper p. line.
We now turn to P4, the triangle axiom. First we consider the cases
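where two of the p. points $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ are proper (Theorems 13, 14, 15).
THEOREM 13. If $A \# B$ and $\mathfrak{C} \,\omega\, AB$, then $A \,\omega\, B\mathfrak{C}$.
PROOF: $AB \,\omega\, \mathfrak{C}$ [Th. 11], so $AB \# B\mathfrak{C}$, so $A \,\omega\, B\mathfrak{C}$.
THEOREM 14. If $A \# \mathfrak{B}$ and $C \,\omega\, A\mathfrak{B}$, then $A \,\omega\, C\mathfrak{B}$.
PROOF: $A\mathfrak{B} = l$, $AC = p$, $C\mathfrak{B} = n$.
By Lemma 7.1, $n$ intersects $l$ or $n \# p$.
If $n$ intersects $l$, then we can directly apply A4.
If $n \# p$, then $A \,\omega\, n$.
THEOREM 15. If $A \# \mathfrak{B}$ and $C \,\omega\, A\mathfrak{B}$, then $\mathfrak{B} \,\omega\, AC$.
PROOF: $AC = l$, $A\mathfrak{B} = m$, $C\mathfrak{B} = n$; $m \# n$.
Let $p$ be any line in $\mathfrak{B}$. $p \# m$ or $p \# n$.
If $p \# m$, then $A \,\omega\, p$ (Lemma 11.1), so $p \# l$.
If $p \# n$, then $C \,\omega\, p$ (Lemma 11.1), so $p \# l$.
We have now proved that $l \,\omega\, \mathfrak{B}$, so $\mathfrak{B} \,\omega\, l$ (Theorem 11).
Let now at least one of the p. points be proper.
THEOREM 16. If $A \# \mathfrak{B}$ and $\mathfrak{C} \,\omega\, A\mathfrak{B}$, then $\mathfrak{B} \,\omega\, A\mathfrak{C}$.
PROOF: $A\mathfrak{B} = l$, $A\mathfrak{C} = p$.
$\mathfrak{C} \,\omega\, l$, so $l \,\omega\, \mathfrak{C}$ [Th. 11], so $l \# p$.
Let $m$ be any line in $\mathfrak{B}$; $m \# l$ or $m \# p$.
If $m \# l$ we choose $Q$ on $m$, so that $Q \,\omega\, l$, $AQ = q$.
$m$ intersects $l \vee m \# q$ (Lemma 7.1).
If $m$ intersects $l$, then $\mathfrak{B}$ is proper, $\mathfrak{B} = B$, and $B \,\omega\, p$ by Th. 10, so
$m \# p$.
If $m \# q$, then $A \,\omega\, m$, so $m \# p$.
We have proved that $m \# p$ for every line $m$ in $\mathfrak{B}$, so $p \,\omega\, \mathfrak{B}$, so $\mathfrak{B} \,\omega\, p$
(Th. 11).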
Note that Axiom A8 has not been used in the proofs of Theorems 13,
14, 15, 16.
There are two other cases of P4 in which only one of the three p. points
is proper. I have not succeeded in proving these from the preceding
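axioms; therefore I introduce them as new axioms:
A9 If $A \# \mathfrak{B}$ and $\mathfrak{C} \,\omega\, A\mathfrak{B}$, then $A \,\omega\, \lambda(\mathfrak{B}, \mathfrak{C})$.
A10 If $\mathfrak{B} \# \mathfrak{C}$ and $A \,\omega\, \lambda(\mathfrak{B}, \mathfrak{C})$, then $\mathfrak{B} \,\omega\, A\mathfrak{C}$.
It follows from the next theorem that the case in which none of the
three p. points is known to be proper need not be considered.
THEOREM 17. If $\mathfrak{A} \# \mathfrak{B}$ and $\mathfrak{C} \,\omega\, \lambda(\mathfrak{A}, \mathfrak{B})$, then at least one of the p.
points $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ is proper.
PROOF: Choose $l$ so that $l \in \mathfrak{C}$, $l \,\omega\, \mathfrak{A}$.
By A8, there is a p. point $\mathfrak{D}$ so that $l \in \mathfrak{D}$, $\mathfrak{D} \in \lambda(\mathfrak{A}, \mathfrak{B})$. $\mathfrak{D} \# \mathfrak{C}$; by Th. 10
$\mathfrak{C}$ is proper or $\mathfrak{D}$ is proper. If $\mathfrak{D}$ is proper, $\mathfrak{D} = D$, then $\lambda(\mathfrak{A}, \mathfrak{B}) = D\mathfrak{A}$
is proper, so, again by Th. 10, $\mathfrak{A}$ is proper or $\mathfrak{B}$ is proper.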
It remains to prove the uniqueness of the p. line through two p. points
which are apart from each other. We first prove some theorems about
improper p. points.
THEOREM 18: If $\mathfrak{A}$ and $\mathfrak{B}$ are improper p. points and $\mathfrak{A} \# \mathfrak{B}$, then
any line in $\mathfrak{A}$ intersects any line in $\mathfrak{B}$.
PROOF: Let $l$ be a line in $\mathfrak{A}$ and $m$ a line in $\mathfrak{B}$. Choose $p$ in $\mathfrak{A}$ so that
$p \,\omega\, \mathfrak{B}$, and choose $C$ on $p$. $C\mathfrak{B} = q$; then $q \# p$. By Th. 16, $\mathfrak{A} \,\omega\, q$, so
$q \,\omega\, \mathfrak{A}$, so $l \# q$. $p$ intersects $q$, so that we infer from A3 that $l$ has a point
in common with $p$ or with $q$.
If $l$ has a point in common with $p$, then $l$ cannot be apart from $p$,
because $\mathfrak{A}$ is improper, so $l = p$, so $l$ intersects $q$. It has now been proved
that in every case $l$ intersects $q$. Repeating this argument for $\mathfrak{B}$ instead of
$\mathfrak{A}$, we find that $m$ intersects $l$.
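THEOREM 19: If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{C} \# \mathfrak{A}$ and $\mathfrak{C} \cap \mathfrak{B} = \mathfrak{A} \cap \mathfrak{B}$, then
$\mathfrak{C} \cap \mathfrak{A} = \mathfrak{A} \cap \mathfrak{B}$.
PROOF: It is clear that $\mathfrak{A} \cap \mathfrak{B} \subset \mathfrak{A} \cap \mathfrak{C}$.
Let $l$ be a line in $\mathfrak{A} \cap \mathfrak{C}$; we must prove that $l \in \mathfrak{B}$.
Case 1: $\mathfrak{A}$ or $\mathfrak{B}$ is a proper p. point. Then $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line $m$.
$m \in \mathfrak{A} \cap \mathfrak{B}$, so $m \in \mathfrak{A} \cap \mathfrak{C}$. By Th. 8, $l = m$, so $l \in \mathfrak{B}$.
Case 2: (General case). Because $l \in \mathfrak{A} \cap \mathfrak{C}$, $\mathfrak{A}$ or $\mathfrak{C}$ is proper, so that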
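we may now assume that $\mathfrak{C}$ is proper. Choose $D$ on $l$ so that $D \# \mathfrak{B}$.
$D\mathfrak{B} = n$. We shall give an indirect proof for $n = l$.
Suppose $n \# l$; then it is impossible that $\mathfrak{A}$ or $\mathfrak{B}$ is a proper p. point,
for in case 1 we know that $l \in \mathfrak{B}$, so $l = D\mathfrak{B} = n$. Thus $\mathfrak{A}$ and $\mathfrak{B}$ are both
improper. It follows from Th. 10 that $\mathfrak{A} \cap \mathfrak{B} = \emptyset$, so $\mathfrak{B} \cap \mathfrak{C} = \emptyset$, so $\mathfrak{C}$ is
improper (A1). But $\mathfrak{C}$ is proper; this contradiction proves that $n \# l$ is
impossible, so $n = l$, and $l \in \mathfrak{B}$.
COROLLARY. In the case that $\mathfrak{C} \# \mathfrak{A}$ we have
$\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B}) \leftrightarrow \mathfrak{A} \cap \mathfrak{C} = \mathfrak{A} \cap \mathfrak{B}$.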
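THEOREM 20. If $\mathfrak{A}$ and $\mathfrak{B}$ are improper p. points, and $\mathfrak{A} \# \mathfrak{B}$, then
$\lambda(\mathfrak{A}, \mathfrak{B})$ is the set of all improper p. points.
PROOF: By Th. 10, $\mathfrak{A} \cap \mathfrak{B} = \emptyset$. Let $\mathfrak{P}$ be any p. point in $\lambda(\mathfrak{A}, \mathfrak{B})$; as
$\mathfrak{A} \# \mathfrak{P}$ or $\mathfrak{B} \# \mathfrak{P}$, we may assume that $\mathfrak{A} \# \mathfrak{P}$.
Then, by Th. 19, $\mathfrak{A} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B} = \emptyset$, so $\mathfrak{P}$ is improper. Conversely,
if $\mathfrak{P}$ is improper, we have by Th. 10, that $\mathfrak{A} \cap \mathfrak{P} = \emptyset = \mathfrak{A} \cap \mathfrak{B}$, so
$\mathfrak{P} \in \lambda(\mathfrak{A}, \mathfrak{B})$.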
The following theorem asserts the uniqueness of the p. point of which
the existence is affirmed in A8.
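THEOREM 21. If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{C}$ and $\mathfrak{D}$ belong to $\lambda(\mathfrak{A}, \mathfrak{B})$, $p \,\omega\, \mathfrak{A}$, $p$
belongs to $\mathfrak{C}$ and to $\mathfrak{D}$, then $\mathfrak{C} = \mathfrak{D}$.
PROOF: Suppose $\mathfrak{C} \# \mathfrak{D}$. As $\mathfrak{C} \# \mathfrak{A}$ and $\mathfrak{D} \# \mathfrak{A}$, it follows from Th.
19 that $\mathfrak{A} \cap \mathfrak{B} = \mathfrak{A} \cap \mathfrak{C} = \mathfrak{A} \cap \mathfrak{D}$. Thus, if $\mathfrak{A} \cap \mathfrak{B}$ contained a line $l$, we
should have $l \in \mathfrak{C} \cap \mathfrak{D}$, so, by Th. 8, $l = p$; but this contradicts $p \,\omega\, \mathfrak{A}$.
It follows that $\mathfrak{A} \cap \mathfrak{B} = \emptyset$.
Thus $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \mathfrak{D}$ are all improper p. points (Th. 20), and, as $\mathfrak{C} \# \mathfrak{D}$,
$\mathfrak{C} \cap \mathfrak{D} = \emptyset$, contradicting $p \in \mathfrak{C} \cap \mathfrak{D}$. We have proved that $\mathfrak{C} \# \mathfrak{D}$ is
impossible, so $\mathfrak{C} = \mathfrak{D}$.
THEOREM 22. If $\mathfrak{A} \# \mathfrak{B}$ and if it is impossible that $\mathfrak{C} \,\omega\, \lambda(\mathfrak{A}, \mathfrak{B})$, then
$\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$.
PROOF: We treat the case that $\mathfrak{C} \# \mathfrak{A}$. Choose $p$ in $\mathfrak{C}$ so that $p \,\omega\, \mathfrak{A}$.
By A8, there is a p. point $\mathfrak{G}$ so that $p \in \mathfrak{G}$, $\mathfrak{G} \in \lambda(\mathfrak{A}, \mathfrak{B})$. We give an
indirect proof for $\mathfrak{C} = \mathfrak{G}$. Suppose that $\mathfrak{C} \# \mathfrak{G}$. As $p \in \mathfrak{C} \cap \mathfrak{G}$, $\mathfrak{C}$ or $\mathfrak{G}$ is
a proper p. point (Th. 10). If $\mathfrak{G}$ is proper, $\mathfrak{G} = G$, then $\lambda(\mathfrak{A}, \mathfrak{B})$ is proper,
$\lambda(\mathfrak{A}, \mathfrak{B}) = l$; $G\mathfrak{A} = l$, $G\mathfrak{C} = p$. $\mathfrak{A} \,\omega\, G\mathfrak{C}$, so $\mathfrak{C} \,\omega\, G\mathfrak{A}$, $\mathfrak{C} \,\omega\, \lambda(\mathfrak{A}, \mathfrak{B})$, which
is impossible by hypothesis. If $\mathfrak{C}$ is proper, $\mathfrak{C} = C$, then $\lambda(\mathfrak{A}, \mathfrak{C})$ is proper,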
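$\lambda(\mathfrak{A}, \mathfrak{C}) = m$. Suppose that $\mathfrak{B} \,\omega\, m$; then we should have $C \,\omega\, \lambda(\mathfrak{A}, \mathfrak{B})$
(A9), which is impossible by hypothesis. But if $\mathfrak{B} \in m$, then $\lambda(\mathfrak{A}, \mathfrak{B})
= m$, $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$, $\mathfrak{C} = \mathfrak{G}$. We have derived a contradiction from the
hypothesis that $\mathfrak{C} \# \mathfrak{G}$, so $\mathfrak{C} = \mathfrak{G}$, and $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$.
Remark: A9 is only used in this proof, A10 nowhere in this paper.
However, A10 will be needed for the derivation of further theorems of
projective geometry.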
The next theorem asserts P2 for p. points.
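THEOREM 23. If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{A} \# \mathfrak{C}$ and $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$, then
$\lambda(\mathfrak{A}, \mathfrak{B}) = \lambda(\mathfrak{A}, \mathfrak{C})$.
PROOF: We first prove the theorem, making the extra assumption
that $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line $l$. In this case, $l \in \mathfrak{A} \cap \mathfrak{B}$, so
$\lambda(\mathfrak{A}, \mathfrak{B}) = \{\mathfrak{P} \mid l \in \mathfrak{P}\}$. Moreover, $l \in \mathfrak{A} \cap \mathfrak{C}$, so $\lambda(\mathfrak{A}, \mathfrak{C}) = \{\mathfrak{P} \mid l \in \mathfrak{P}\}$.
Thus $\lambda(\mathfrak{A}, \mathfrak{B}) = \lambda(\mathfrak{A}, \mathfrak{C})$.
In the general case, let $\mathfrak{D}$ be any p. point in $\lambda(\mathfrak{A}, \mathfrak{B})$. Suppose that
$\mathfrak{D} \,\omega\, \lambda(\mathfrak{A}, \mathfrak{C})$; then, by Th. 17, at least one of the p. points $\mathfrak{A}, \mathfrak{C}, \mathfrak{D}$ is
proper, so that $\lambda(\mathfrak{A}, \mathfrak{B})$ is proper, but in this case we have already proved
that $\lambda(\mathfrak{A}, \mathfrak{B}) = \lambda(\mathfrak{A}, \mathfrak{C})$. Thus the assumption that $\mathfrak{D} \,\omega\, \lambda(\mathfrak{A}, \mathfrak{C})$ has led to
a contradiction; by Th. 22, $\mathfrak{D} \in \lambda(\mathfrak{A}, \mathfrak{C})$. We have now proved that
$\lambda(\mathfrak{A}, \mathfrak{B}) \subset \lambda(\mathfrak{A}, \mathfrak{C})$. In particular, $\mathfrak{B} \in \lambda(\mathfrak{A}, \mathfrak{C})$. Now we prove by the same
argument, interchanging $\mathfrak{B}$ and $\mathfrak{C}$, that $\lambda(\mathfrak{A}, \mathfrak{C}) \subset \lambda(\mathfrak{A}, \mathfrak{B})$.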
Bibliography
[1] HEYTING, A., Intuitionistische axiomatiek der projectieve meetkunde. Thesis,
University of Amsterdam. Groningen 1925.
[2] HEYTING, A., Zur intuitionistischen Axiomatik der projektiven Geometrie.
Mathematische Annalen, vol. 98 (1927), pp. 491-538.
[3] HEYTING, A., Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte
preuss. Akad. Wiss. Berlin (1930), pp. 42-56.
[4] HEYTING, A., Die formalen Regeln der intuitionistischen Mathematik.
Sitzungsberichte preuss. Akad. Wiss. Berlin (1930), pp. 55-71.
[5] HEYTING, A., Intuitionism, an introduction. Amsterdam 1956.
Symposium on the Axiomatic Method
FOUNDATIONS OF GEOMETRY FROM THE STANDPOINT
OF GENERAL TOPOLOGY
KAROL BORSUK
University of Warsaw, Warsaw, Poland
The problem of founding geometry axiomatically, first posed and solved
by Euclid, has not lost its relevance to this day. At the end of the 19th
century Hilbert [9] gave his famous axiomatization of geometry, thereby
substantially deepening and completing the ideas of Euclid and giving
geometry the form of a deductive theory in the modern sense. Hilbert's
axiomatization also settled definitively the relation between Euclidean
geometry and the hyperbolic geometry of Lobachevsky and Bolyai.
This did not, however, exhaust the problem of the foundations of geometry.
On the one hand, the development of mathematical logic opened wide
possibilities for investigating the logical structure of geometry as a
deductive theory. A significant part of modern work on the foundations of
geometry goes in this direction. On the other hand, the introduction of
various types of general spaces gave rise to the problem of clarifying the
position of the classical spaces among all abstract spaces.
The problem of characterizing the classical geometries among all
Riemannian geometries was already posed in the 19th century by Sophus
Lie [15]. Lie, building on earlier ideas of Helmholtz [8], gave this
problem the form of a characterization of the group of rigid motions. But
only the rise of general set theory, and of the axiomatic theory of
general abstract spaces based on it, gave the problem of the foundations
of geometry a truly general, modern form. From this general standpoint,
Lie's problem of founding geometry on the group of rigid motions was
attacked in 1930 by Kolmogoroff [11]. Kolmogoroff gave a system of axioms
characterizing the class of Riemannian spaces of constant curvature.
Unfortunately, Kolmogoroff did not publish complete proofs of his
assertions, and only in recent years have two papers by Tits [20], [21]
and then a paper by Freudenthal [7] appeared, which give a solution, in a
certain sense definitive, of the space problem on the basis of a
characterization of the group of motions. Freudenthal's paper also gives a
far-reaching classification of the geometries characterized in this way.
A somewhat different direction is taken by the work that aims at a
characterization of the classical spaces on the basis of the general
metric geometry developed by Menger [17]. Besides the essential results of
Menger [17], [18], one should mention here the results of Wilson [26],
Garrett Birkhoff [2], Blumenthal [3], H. C. Wang [25], and others. Wilson
characterized the Euclidean metrics by means of a certain metric property
of every four points of the space. Garrett Birkhoff and the others base
their investigations on a postulate of metric homogeneity.
In this lecture I would like to deal with a characterization of the
classical spaces on the basis of a classification of general topological
spaces. The aim is above all to formulate the problem and to indicate the
difficulties that arise here. I am not able to give a definitive solution
of the problem, and I even think that we are still far from one. I would
only like to give a partial solution, which can serve as an illustration
of the general tendency of these considerations.
To build the foundations of geometry on a broad basis of topological
properties, one needs a more fully developed systematics of topological
spaces. So far such a systematics is little developed. The situation is
best for the most general types of topological spaces. Various axioms
(regularity, normality, existence of a basis, and so on) make it possible
to define certain classes of spaces with more or less rich geometric
content. But the classification of the more special spaces is so far
highly deficient, and we are still far from a topological determination
of the important class of so-called polyhedra. By polyhedra I mean here
those separable spaces for which a locally finite triangulation exists. I
believe that only a development of the systematics of topological spaces
will create an adequate basis for a precise clarification of the nature of
the classical spaces. But since geometry needs metric axioms besides
topological ones, it seems to me that an axiomatization useful for the
purposes named above should contain metric axioms besides topological
ones, and should do so with regard to the following "principle of
topological-metric parallelism":
PRINCIPLE [T||M]. The topological axioms should imply the existence of a
metric satisfying the metric axioms. The metric axioms should imply that
the topological axioms are satisfied.
The well-known difficulties met in attempts to characterize the Euclidean
spaces topologically mean that a complete realization of the principle
[T||M] is rather hopeless at the present level of topology. One can,
however, outline an axiomatization of Euclidean geometry that takes this
principle into account at least partially.
It is convenient to divide our axioms into the following three groups:
I. Axioms of general spaces.
II. Axioms of regularity.
III. Special axioms.
For the axioms of the first group one may choose any axioms that
characterize the metrizable, separable spaces. One can take, for example,
the three axioms of closure (of Kuratowski [13], [14]), the axiom of
normality, and the axiom of a countable basis. If, as usual, we denote the
closure of the set $X$ by $\overline{X}$, the first group of topological
axioms has the following form (see [14]):
AXIOMS (I, T).
(1) $\overline{X \cup Y} = \overline{X} \cup \overline{Y}$.
(2) If $X$ is empty or consists of a single point, then $\overline{X} = X$.
(3) $\overline{\overline{X}} = \overline{X}$.
(4) For any two disjoint closed sets $X$ and $Y$ there is an open set $G$
such that $X \subset G$ and $\overline{G} \cap Y = \emptyset$ (axiom of
NORMALITY).
(5) There is a sequence $\{G_n\}$ of open sets such that every open set is
the union of certain sets of this sequence (axiom of a COUNTABLE BASIS).
As the corresponding group of metric axioms we take the three axioms of
metric spaces of Fréchet (with a modification due to Lindenbaum [16]) and
the axiom of separability. If, as usual, we denote the distance from the
point $x$ to the point $y$ by $\rho(x, y)$, the first group of metric
axioms has the following form:
AXIOMS (I, M).
(1) $\rho(x, y)$ is real.
(2) $\rho(x, y) = 0$ if and only if $x = y$.
(3) $\rho(x, y) + \rho(x, z) \geq \rho(y, z)$.
(4) There is an at most countable subset that is dense in the space (axiom
of SEPARABILITY).
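Axiom (3) is written with $x$ appearing in both terms on the left, which is why symmetry and non-negativity need not be postulated separately: putting $z = x$ gives $\rho(x, y) \geq \rho(y, x)$, hence symmetry, and putting $z = y$ gives $2\rho(x, y) \geq 0$. The following Python sketch, which is only an illustration and not part of the lecture, tests axioms (1)-(3) on a finite sample of points; the function names and the sample data are assumptions made for the example, and a finite check can of course refute but never prove that a function is a metric.

```python
import math
from itertools import product


def satisfies_axioms_I_M(points, rho, tol=1e-12):
    """Test axioms (1)-(3) of group (I, M) on a finite sample (hypothetical helper)."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        # (1) the distance is a finite real number
        if not isinstance(d, (int, float)) or not math.isfinite(d):
            return False
        # (2) rho(x, y) = 0 exactly when x = y
        if (abs(d) <= tol) != (x == y):
            return False
    # (3) rho(x, y) + rho(x, z) >= rho(y, z), with x in both terms on the left
    return all(
        rho(x, y) + rho(x, z) >= rho(y, z) - tol
        for x, y, z in product(points, repeat=3)
    )


def is_symmetric_and_nonnegative(points, rho, tol=1e-12):
    """Properties that follow from (2) and (3) and so are not postulated."""
    return all(
        abs(rho(x, y) - rho(y, x)) <= tol and rho(x, y) >= -tol
        for x, y in product(points, repeat=2)
    )


# Sample data (assumed for illustration): a few points of the Euclidean plane.
plane_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]


def skew(x, y):
    """An asymmetric 'distance' on the reals; it violates axiom (3)."""
    return y - x if y >= x else 2 * (x - y)


line_points = [0.0, 1.0, 2.5]
euclid_ok = satisfies_axioms_I_M(plane_points, math.dist)
skew_ok = satisfies_axioms_I_M(line_points, skew)
```

Here `euclid_ok` comes out true and `skew_ok` false, since for the skew function the choice $z = x$ in axiom (3) already fails.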
Urysohn's well-known metrization theorem [23] states that the axioms
(I, T) imply the existence of a metric satisfying the axioms (I, M).
Conversely, the existence of a metric satisfying the axioms (I, M) entails
that all axioms of the group (I, T) are satisfied. Thus for these axioms
the principle of topological-metric parallelism is satisfied.
The spaces satisfying the first group of axioms form an important and
well-known class. But this class is so general that it also contains many
spaces with rather complicated and unintuitive properties. The second
group of axioms is meant to single out, among the general metric spaces, a
class of spaces with particularly simple, intuitive properties. These
axioms should thus eliminate various so-called pathological phenomena [5].
Together with the axioms of the first group they should form a topological
basis for every "reasonable" geometry. In contrast to the precisely
determined first group of axioms, the formulation of the axioms of the
second group is not well determined. It seems that these axioms should
concern above all the local properties of the space, and should define a
class of spaces that is extensive enough to contain all polyhedra but
special enough to exclude all spaces with paradoxical properties. Here I
shall set up this group of axioms only provisionally, as follows:
AXIOMS (II, T).
(1) Local compactness.
(2) Local connectedness.
Certainly the group of axioms set up in this way is not sufficient to
characterize the "reasonable spaces". For our modest purpose of giving a
very imperfect prototype of a topological-metric axiomatization, it will
nevertheless suffice.
The corresponding group of metric axioms consists of two axioms:
AXIOMS (II, M).
(1) Compactness of bounded closed sets.
(2) Local convexity.
Here a space $X$ is said to be locally convex if for every point $a \in X$
there exists a neighborhood $U$ such that for any two points $x, y \in U$
there is at least one point $z \in X$ for which
$\rho(x, z) = \rho(y, z) = \tfrac{1}{2} \cdot \rho(x, y)$
holds. Every such point $z$ is called a midpoint of the pair $x, y$.
It is known (cf. Menger) that in a metric space $X$ in which the axioms
(II, M) are satisfied there is, for every point $a \in X$, a positive
number $r$ such that every point $x \in X - (a)$ whose distance from $a$
is less than $r$ can be joined to $a$ by a straight line segment. It
follows at once that the axioms (II, M) imply the axioms (II, T). Whether,
conversely, the axioms (II, T) (together with the axioms of the first
group) imply the existence of a metric satisfying the axioms (II, M) has
not yet been definitively settled. The results of Bing [1], who proved
Menger's conjecture on the convex metrizability of locally connected
continua, and also the recently obtained results of the Japanese
mathematicians Tanaka and Tominaga [19], [22], allow one to conjecture,
however, that the principle [T||M] is satisfied for these axioms. But
since the second group of axioms certainly cannot be regarded as
definitively set up, I think that the essential difficulties will appear
only when this group of axioms has been suitably completed.
The axioms of the third group should state the specific properties of the
space in question. If we want to realize the principle of
topological-metric parallelism, then the axioms (III, T), together with
the axioms (I, T) and (II, T), should characterize the space under
consideration completely in topological terms. A topological
characterization of the Euclidean plane was given many years ago by van
Kampen [10]. Thus in this special case, setting up an axiomatization that
satisfies the principle [T||M] presents no difficulties. Finding a similar
axiomatization for higher-dimensional Euclidean spaces is an incomparably
harder task. It turns out to be possible, however, to state the axiom
groups (III, T) and (III, M) in such a way that together, and with the
axioms of the first and second groups, they form a complete axiomatization
of elementary geometry. The principle [T||M] is neglected here, because
the separation of the topological and the metric axioms that it demands is
not achieved. The difficulty in realizing such a separation lies mainly on
the topological side. Thus one comes the closer to realizing the principle
[T||M], the richer the content of the topological axioms (III, T) becomes.
In other words, it is advisable to reduce the role of the metric axioms
(III, M) as far as possible.
To formulate the axioms of the group (III, T) for Euclidean spaces, I
shall use the following notion:
A true cycle (in the sense of Vietoris [24]) $\gamma$ is linked in the
space $X$ with the point $x \in X$ if $\gamma$ has a compact carrier
$A \subset X - (x)$ and $\gamma$ is not homologous to zero in the set
$X - (x)$.
The axiom group (III, T) then consists of the following four axioms:
AXIOMS (III, T).
(1) The space is connected.
(2) The dimension of the space is equal to $n$.
(3) In every neighborhood of every point of the space there is a true
cycle that is linked in the space with this point.
(4) For every true cycle of the space, the set of points with which this
cycle is linked is open.
Obviously the axioms of the groups (I, T), (II, T), (III, T) are not
sufficient to characterize the Euclidean $n$-dimensional space
topologically. They are satisfied, for example, by every open
$n$-dimensional manifold and also by various other spaces. They can be
strengthened in various ways. But since, as I have already said, we cannot
give a purely topological characterization of the $n$-dimensional
Euclidean space, we are forced, neglecting the principle [T||M], to
introduce certain metric conditions. To characterize the $n$-dimensional
Euclidean space completely, it suffices to add to the axioms named above
the following metric axiom (III, M):
AXIOM (III, M).
Every four points of the space are congruent to some four points of the
Euclidean 3-dimensional space $E_3$.
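For a finite sample of points, axiom (III, M) can be tested numerically. The sketch below is an illustration only and is not taken from the lecture: it relies on the classical criterion (usually attributed to Schoenberg) that four points with symmetric distances $d_{ij}$ are congruent to four points of $E_3$ exactly when the $3 \times 3$ matrix with entries $\tfrac{1}{2}(d_{0i}^2 + d_{0j}^2 - d_{ij}^2)$ is positive semidefinite, and it tests semidefiniteness through the signs of all principal minors. All function names, the tolerance, and the sample data are assumptions made for the example.

```python
import math
from itertools import combinations


def gram_matrix(d):
    """Gram matrix of a 4x4 symmetric distance table d, based at point 0."""
    return [
        [(d[0][i] ** 2 + d[0][j] ** 2 - d[i][j] ** 2) / 2 for j in (1, 2, 3)]
        for i in (1, 2, 3)
    ]


def det(m):
    """Determinant of a square matrix of size 1, 2 or 3."""
    if len(m) == 1:
        return m[0][0]
    if len(m) == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def embeds_in_E3(d, tol=1e-9):
    """True if the Gram matrix is positive semidefinite (all principal minors >= 0)."""
    g = gram_matrix(d)
    for k in (1, 2, 3):
        for idx in combinations(range(3), k):
            minor = [[g[i][j] for j in idx] for i in idx]
            if det(minor) < -tol:
                return False
    return True


def four_point_condition(points, rho, tol=1e-9):
    """Test every quadruple of the sample against the criterion above."""
    for quad in combinations(points, 4):
        d = [[rho(p, q) for q in quad] for p in quad]
        if not embeds_in_E3(d, tol):
            return False
    return True


# Sample 1 (assumed): corners of a unit square and its center, Euclidean distance.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]


def tripod(p, q):
    """Sample 2 (assumed): a center 'c' at distance 1 from three leaves, leaves 2 apart."""
    if p == q:
        return 0.0
    return 1.0 if "c" in (p, q) else 2.0


square_ok = four_point_condition(square, math.dist)
tripod_ok = four_point_condition(["c", "p", "q", "r"], tripod)
```

The tripod satisfies the triangle inequality but fails the four-point condition: in a Euclidean space the center would have to be the midpoint of each of the three pairs of leaves at once.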
This axiom, which was formulated by Menger [17] and used by Wilson [26]
and others, is, as usual, called the four-point condition.
The axiomatization of the Euclidean $n$-dimensional space consisting of
the three groups of topological and metric axioms is in reality only a
modification of the purely metric axiomatization given by Wilson [26] in
1932. The purpose of this modification was to replace the metric axioms by
topological ones to the greatest possible extent, and so to come closer to
realizing the principle [T||M].
We shall now show that our axiomatization characterizes the Euclidean
$n$-dimensional space $E_n$ completely.
From the axioms of the first and second groups it follows that for every
point $a \in X$ there exists an open neighborhood $U$ such that every
point $b \in U$ can be joined to $a$ by a straight line segment. We may
assume that this neighborhood is bounded. From the four-point condition it
follows that the segment joining $a$ and $b$ is unique. We shall denote it
by $\overline{ab}$. Since $U$ is bounded, the union of all these segments
is also bounded. Applying the first of the axioms (II, M), we easily
conclude that the segment $\overline{ab}$ depends continuously on its
endpoints $a$ and $b$.
To show now that every segment $\overline{ab}$ can be extended beyond the
endpoint $b$, we consider a neighborhood $V$ of the point $b$ so small
that it does not contain the point $a$ and that any two of its points can
be joined by a uniquely determined segment depending continuously on its
endpoints. By the third of the axioms (III, T) there is in the
neighborhood $V$ a true cycle $\gamma$ that is linked with $b$ (cf.
Fig. 1). Consider a compact carrier $B \subset V - (b)$ of $\gamma$ and a
point $c \in \overline{ab} \cap V$ different from $b$. One easily sees
that the true cycle $\gamma$ is homologous to zero in the union of all
segments $\overline{cx}$ with $x \in B$. Since $\gamma$ is not homologous
to zero in the set $X - (b)$, we conclude that there is a point
$b' \in B$ such that the segment $\overline{cb'}$ contains the point $b$.
By means of the four-point condition we conclude that
$\overline{ac} \cup \overline{cb'}$ is the required extension of the
segment $\overline{ab}$.
From this it follows easily (by applying the first of the axioms (II, M))
that there is a half-line which has $a$ as its endpoint and contains the
point $b$. Using the first of the axioms (II, M) and the four-point
condition one shows further that this half-line depends continuously on
the point $b$, in the sense that the point of this half-line at distance
$s > 0$ from $a$ depends continuously on $s$ and $b$.
Fig. 1
We shall now show that for every point $x \in X$ there is a half-line $H$
which has $a$ as its endpoint and contains the point $x$. By the second of
the axioms (II, M) there is a number $r > 0$ such that such a half-line
exists for all points $x$ with $\rho(a, x) < r$. Let $Q$ denote the solid
ball about $a$ with radius $r$. We assume that
$\rho(a, x) > r > 0$
holds. Since the space $X$ is connected (axiom (III, T), 1), locally
connected (axiom (II, T), 2) and locally compact (axiom (II, T), 1), there
is in $X$ a simple arc $B$ joining $a$ and $x$. Let $s$ be a number
greater than the diameter of $B$. By the third of the axioms (III, T)
there is in $Q$ a true cycle $\gamma$ that is linked with $a$. Let $A$
denote the compact carrier of $\gamma$ contained in $Q - (a)$ (cf.
Fig. 2). Since $Q$ is contractible in itself to the point $a$, this cycle
is homologous to zero in $Q$. As $x$ does not lie in $Q$, the cycle is
therefore not linked with the point $x$. Thus:
(i) $\gamma$ is linked with $a$,
(ii) $\gamma$ is not linked with $x$.
Fig. 2
Now we assume that none of the half-lines with the endpoint $a$ contains
the point $x$. For every vertex $p$ of the cycle $\gamma$ we denote by
$\varphi(p)$ the point of the half-line $ap$ at distance $s$ from $a$.
Since the half-line in question depends continuously on the point $p$, we
conclude that $\varphi$ maps the true cycle $\gamma$ onto a true cycle
$\gamma^*$. Moreover, the true cycles $\gamma$ and $\gamma^*$ are
homologous in the union of all segments $\overline{p\varphi(p)}$, where
$p \in A$. But since this union contains neither of the points $a$ and
$x$, we conclude that
(iii) $\gamma \sim \gamma^*$ in $X - (a) - (x)$
holds. From this and from (i) and (ii) it follows that
(iv) $\gamma^*$ is linked with $a$,
(v) $\gamma^*$ is not linked with $x$.
But the cycle $\gamma^*$ lies on the surface $S$ of the ball about the
point $a$ with radius $s$. Since $s$ is greater than the diameter
$\delta(B)$ of $B$, it follows that $B$ is disjoint from $S$. If we now
take (iv) and the fourth of the axioms (III, T) into account, we see that
the cycle $\gamma^*$ must be linked with the point $x$, which contradicts
condition (v).
We have thus shown that any two distinct points of our space lie on a
straight line. But, as W. A. Wilson has shown [26], a metric,
$n$-dimensional, separable space in which any two points lie on a straight
line and in which the axiom (III, M) is satisfied is congruent to the
$n$-dimensional Euclidean space $E_n$. Thus we see that our axioms
characterize the space $E_n$ completely.
The main defect of the axiomatization given here is that in the third
group of axioms the topological and the metric assumptions are not
separated. I have already mentioned that, at the present state of
topology, this separation may be regarded as hopeless.
In the case of the plane, however, matters are different. In this case we
can use the topological characterization of the Euclidean plane given by
van Kampen [10]. I shall state this axiomatization in a somewhat modified
form, in order to avoid the special notions of a simple arc and a simple
closed curve used in van Kampen's original axiomatization. In this
modified form, the topological axiomatization of the plane consists of the
axioms (I, T), (II, T) and the following group of special axioms:
AXIOMS (III, T)$_2$.
(1) The space is connected.
(2) The space is not compact.
(3) No compact cut of the space is acyclic.
(4) Every compact subset of the space that is not acyclic in dimension 1
is a cut.
To realize the principle [T||M], it now suffices to take the following as
metric axioms:
AXIOMS (III, M)$_2$.
(1) Any two points lie on a straight line.
(2) Four-point condition.
To conclude this lecture I would like to add some remarks of a general
nature. I have treated the problem of the foundations of geometry here as
a fragment of the general problem of classifying topological spaces. The
three groups of axioms given here should also be regarded from this
standpoint. I have already remarked that the second group, which I have
called the group of regularity, was set up here only provisionally. It is
meant to single out, among all topological spaces, a class of spaces with
particularly regular properties. A full topological characterization of
the class of polyhedra presents very essential difficulties, since it
would bridge the gulf between the axiomatically defined abstract
topological spaces and the figures of elementary geometry defined by
construction. Only in the case of polyhedra of dimension at most two has
it recently been possible to overcome these difficulties (Kosiński [12]).
On the other hand, an axiomatic treatment of a certain part of the
topological properties of polyhedra is certainly possible in the general
case as well (cf. [4]). The question arises, however, how this "certain
part" should be defined. This question is closely connected with the
question of a reasonable classification of topological invariants.
Since Felix Klein's Erlangen Program, geometric properties have been
classified according to the classes of mappings under which they are
invariant. If one replaces the homeomorphisms by a more general class
$\mathfrak{K}$ of certain continuous mappings, one singles out among all
topological properties a narrower class of properties invariant under the
mappings belonging to $\mathfrak{K}$. We shall call these properties
$\mathfrak{K}$-invariants (cf. [4]). About the class $\mathfrak{K}$ we
shall assume only that the composition of two mappings belonging to it
again belongs to it.
We shall say that two spaces $X$ and $Y$ belong to the same
$\mathfrak{K}$-type if they have the same $\mathfrak{K}$-properties. It is
easy to see that two spaces $X$ and $Y$ belong to the same
$\mathfrak{K}$-type if and only if there are two $\mathfrak{K}$-mappings
$f\colon X \to Y$ (onto) and $g\colon Y \to X$ (onto).
Let us consider some examples:
1. Let $\mathfrak{K}$ denote the class of all continuous mappings. The
$\mathfrak{K}$-invariants then include, for example, compactness,
separability, connectedness and, for compacta, also local connectedness.
2. Much more interesting is the case where $\mathfrak{K}$ is the class of
all so-called r-mappings. By an r-mapping one means a continuous mapping
$f\colon X \to Y$ (onto)
for which there exists a continuous right inverse, that is, a mapping
$g\colon Y \to X$ (into)
satisfying the condition $fg(y) = y$ for every point $y \in Y$. The class
of invariants of r-mappings is very rich, and hence two spaces belonging
to the same r-type show a far-reaching similarity of their properties.
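The definition of an r-mapping can be tried out on finite topological spaces, where continuity is a finite check. The following Python sketch is an illustration under stated assumptions and not part of the lecture: a space is given by its set of points together with its family of open sets, a mapping is a dictionary, and all names and both sample spaces are hypothetical. It searches all set-theoretic right inverses of a continuous surjection and keeps the continuous ones.

```python
from itertools import product


def preimage(f, subset):
    """Inverse image of `subset` under the mapping f (a dict)."""
    return frozenset(x for x in f if f[x] in subset)


def is_continuous(f, opens_dom, opens_cod):
    """f is continuous if the inverse image of every open set is open."""
    return all(preimage(f, u) in opens_dom for u in opens_cod)


def continuous_right_inverses(f, X, opens_X, Y, opens_Y):
    """All continuous g: Y -> X with f(g(y)) = y for every y in Y."""
    ys = sorted(Y)
    fibres = [[x for x in sorted(X) if f[x] == y] for y in ys]
    found = []
    for choice in product(*fibres):  # pick one point in each fibre
        g = dict(zip(ys, choice))
        if is_continuous(g, opens_Y, opens_X):
            found.append(g)
    return found


def is_r_mapping(f, X, opens_X, Y, opens_Y):
    onto = {f[x] for x in X} == set(Y)
    return (
        onto
        and is_continuous(f, opens_X, opens_Y)
        and bool(continuous_right_inverses(f, X, opens_X, Y, opens_Y))
    )


fs = frozenset
# Sample 1 (assumed): a three-point space retracted onto a two-point subspace.
X1, opens_X1 = {0, 1, 2}, {fs(), fs({0}), fs({0, 1}), fs({0, 1, 2})}
Y1, opens_Y1 = {0, 1}, {fs(), fs({0}), fs({0, 1})}
f1 = {0: 0, 1: 1, 2: 1}

# Sample 2 (assumed): the identity from the discrete two-point space onto the
# same set with a coarser topology; continuous and onto, but no continuous
# right inverse exists.
X2, opens_X2 = {0, 1}, {fs(), fs({0}), fs({1}), fs({0, 1})}
f2 = {0: 0, 1: 1}

first_is_r = is_r_mapping(f1, X1, opens_X1, Y1, opens_Y1)
second_is_r = is_r_mapping(f2, X2, opens_X2, Y1, opens_Y1)
```

The second sample shows that the existence of a continuous right inverse is a genuine restriction: as a map of sets every surjection has a right inverse, but here the only candidate fails to be continuous.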
Similarly one can consider the invariants of all open mappings, or of all
continuous mappings with finite inverse images of points, or of all
continuous mappings with acyclic inverse images of points, and so on. To
each such class of invariants there corresponds a division of all spaces
into corresponding $\mathfrak{K}$-types.
Since a full topological characterization of the class of polyhedra is
rather hopeless, it seems advisable to consider certain characterizations
of polyhedra from the standpoint of various classes $\mathfrak{K}$. From
the standpoint of the r-invariants, the polyhedra can be characterized as
metrizable, separable spaces by the following conditions:
1. Local compactness.
2. Local contractibility.
3. Finite dimension at every point.
Thus we may regard these three properties as the axioms of regularity from
the standpoint of the theory of r-invariants.
In a similar way, for the topological axioms of the third group one may
demand, instead of a full topological characterization, a relative
characterization, that is, a characterization in the sense of a certain
$\mathfrak{K}$-type. One may expect that a systematic classification of
topological invariants will make it possible in this way to determine
clearly the position of the classical spaces among the general topological
spaces.
Bibliography
[1] BING, R. H., Partitioning a set. Bulletin of the American Mathematical
Society, vol. 55 (1949), pp. 1101-1110.
[2] BIRKHOFF, Garrett, Metric foundations of geometry I. Transactions of the
American Mathematical Society, vol. 55 (1944), pp. 465-492.
[3] BLUMENTHAL, L., Theory and applications of distance geometry. Oxford 1953,
pp. 1-347.
[4] BORSUK, K., On the topology of retracts. Annals of Mathematics, vol. 48 (1947),
pp. 1082-1094.
[5] BORSUK, K., Sur l'élimination de phénomènes paradoxaux en topologie générale.
Proceedings of the International Congress of Mathematicians, vol. I,
Amsterdam 1954, pp. 1-12.
[6] FRÉCHET, M., Les espaces abstraits. Paris 1928, pp. XI + 296.
[7] FREUDENTHAL, H., Neuere Fassungen des Riemann-Helmholtz-Lieschen
Raumproblems. Mathematische Zeitschrift, vol. 63 (1956), pp. 374-405.
[8] HELMHOLTZ, H., Über die tatsächliche Grundlagen der Geometrie.
Wissenschaftliche Abhandlungen, vol. II (1883), pp. 610-617.
[9] HILBERT, D., Grundlagen der Geometrie. Leipzig 1900.
[10] KAMPEN, E. R. VAN, On some characterization of 2-dimensional manifolds.
Duke Mathematical Journal, vol. 1 (1935), p. 74.
[11] KOLMOGOROFF, A., Zur topologisch-gruppentheoretischen Begründung der
Geometrie. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen,
Mathematisch-Physikalische Klasse (1930), pp. 208-210.
[12] KOSIŃSKI, A., A topological characterization of 2-polytopes. Bulletin de
l'Académie Polonaise des Sciences, Cl. III, vol. II (1954), pp. 321-323.
[13] KURATOWSKI, C., Sur l'opération $\overline{A}$ de l'Analysis Situs.
Fundamenta Mathematicae, vol. 3 (1922), pp. 182-199.
[14] KURATOWSKI, C., Topologie I. Monografie Matematyczne, Warszawa 1952,
pp. XI + 450.
[15] LIE, S., Über die Grundlagen der Geometrie. Gesammelte Abhandlungen II
(1922), pp. 380-468.
[16] LINDENBAUM, A., Contribution à l'étude de l'espace métrique I. Fundamenta
Mathematicae, vol. 8 (1926), pp. 209-222.
[17] MENGER, K., Untersuchungen über allgemeine Metrik. Mathematische
Annalen, vol. 100 (1928), pp. 75-163.
[18] MENGER, K., Géométrie générale. Mémorial des Sciences Mathématiques,
vol. 124, Paris 1954, pp. 1-80.
[19] TANAKA, Tadashi and TOMINAGA, Akira, Convexification of locally connected
generalized continua. Journal of Science of the Hiroshima University, vol. 19
(1955), pp. 301-306.
[20] TITS, J., Étude de certains espaces métriques. Bulletin de la Société
Mathématique de Belgique (1953), pp. 44-52.
[21] TITS, J., Sur un article précédent: Études de certains espaces métriques.
Bulletin de la Société Mathématique de Belgique (1953), pp. 124-125.
[22] TOMINAGA, Akira, On some properties of non-compact Peano spaces. Journal of
Science of the Hiroshima University, vol. 19 (1956), pp. 457-467.
[23] URYSOHN, P., Zum Metrisationsproblem. Mathematische Annalen, vol. 94
(1925), pp. 309-315.
[24] VIETORIS, L., Über den höheren Zusammenhang kompakter Räume und eine
Klasse von zusammenhangstreuen Abbildungen. Mathematische Annalen, vol.
97 (1927), pp. 454-472.
[25] WANG, H. C., Two-point homogeneous spaces. Annals of Mathematics, vol.
55 (1952), pp. 177-191.
[26] WILSON, W. A., A relation between metric and euclidean spaces. American
Journal of Mathematics, vol. 54 (1932), pp. 505-517.
Symposium on the Axiomatic Method
LATTICE-THEORETIC APPROACH TO PROJECTIVE
AND AFFINE GEOMETRY
BJARNI JONSSON
University of Minnesota, Minneapolis, Minnesota, U.S.A.
The results that we are going to discuss are due to several authors. The
earliest work along these lines was done by Menger in the late twenties.
He was joined a few years later by von Neumann and Birkhoff. A large
number of more recent contributions can be found in the papers listed in
the bibliography; we shall in particular make use of results due to Frink
and Schützenberger on projective geometry, and to Croisot, Maeda, Sasaki
and Wilcox on affine geometry and its generalizations. The bibliography
includes a number of papers that are not concerned directly with geometry,
but in which at least some of the ideas and methods were suggested
by the investigations of geometric lattices.
1. Concepts from lattice theory. A lattice can be defined as a partially
ordered set in which any two elements have a least upper bound and a
greatest lower bound. We shall use $\leq$ for the partial ordering relation
and write $x + y$ and $xy$ for the least upper bound, or sum, and the
greatest lower bound, or product, of two elements $x$ and $y$. Most of our
lattices will be complete, i.e., any system of elements $x_i$, $i \in I$, will have a
least upper bound and a greatest lower bound
$\sum_{i \in I} x_i$ and $\prod_{i \in I} x_i$.
In any complete lattice there exist a zero element 0 and a unit element 1
such that $0 \leq x \leq 1$ for every lattice element $x$. Even when we consider
lattices that are not complete we shall always assume that they have a
zero element and a unit element. A lattice is said to be complemented if for
any element $x$ there exists an element $y$ such that $x + y = 1$ and $xy = 0$.
If, for any elements $a$, $b$, and $x$ with $a \leq x \leq b$ there exists an element $y$
such that $x + y = b$ and $xy = a$, then the lattice is said to be relatively
complemented. Clearly every relatively complemented lattice (with a zero
element and a unit element) is complemented.
An element $a$ is said to cover an element $b$ if $b < a$ and if there exists
no element $x$ such that $b < x < a$. An element that covers 0 is called an
atom, and an element covered by 1 is called a dual atom. A lattice in which
every element is a sum of atoms is said to be atomistic. A system of
elements $x_i$, $i \in I$, in a complete lattice is said to be independent if
$\left(\sum_{i \in J} x_i\right)\left(\sum_{i \in K} x_i\right) = 0$
whenever $J$ and $K$ are disjoint subsets of $I$.
We are primarily interested in lattices that are not distributive, but
certain special cases of the distributive law will play an important role.
A complete lattice is said to be continuous ¹ if the equation
\[ a \sum_{i \in I} x_i = \sum_{i \in I} a x_i \]
holds whenever the set $\{x_i \mid i \in I\}$ is directed. (A partially ordered set is
said to be directed if any two elements of the set have an upper bound
that also belongs to the set.) Two elements $b$ and $c$ are said to form a
modular pair, in symbols $M(b, c)$, if
\[ (x + b)c = x + bc \quad\text{whenever } x \le c. \]
If this holds for any two elements $b$ and $c$, then the lattice is said to be
modular. If the relation $M$ is symmetric, i.e., if for any two elements $b$
and $c$ the conditions $M(b, c)$ and $M(c, b)$ are equivalent, then the lattice is
said to be semi-modular. Finally, a lattice is said to be special if any two
elements that are not disjoint form a modular pair, i.e., if the condition
$M(b, c)$ holds whenever $bc \ne 0$.
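The notion of a modular pair can be tried out on small finite lattices. The following is only a minimal sketch: the two example lattices (the five-element "pentagon" and the divisors of 12) and all function names are our own illustrative assumptions and do not come from the text. Sums and products are found by brute force from the ordering relation.

```python
def lattice_ops(elements, leq):
    """Brute-force sum (join) and product (meet) from an order relation."""
    def join(x, y):
        uppers = [u for u in elements if leq(x, u) and leq(y, u)]
        return next(u for u in uppers if all(leq(u, v) for v in uppers))

    def meet(x, y):
        lowers = [w for w in elements if leq(w, x) and leq(w, y)]
        return next(w for w in lowers if all(leq(v, w) for v in lowers))

    return join, meet


def is_modular_pair(elements, leq, b, c):
    """M(b, c): (x + b)c = x + bc for every x <= c."""
    join, meet = lattice_ops(elements, leq)
    return all(
        meet(join(x, b), c) == join(x, meet(b, c))
        for x in elements
        if leq(x, c)
    )


# Hypothetical example 1: the pentagon 0 < a < c < 1, 0 < b < 1.
PENTAGON = ["0", "a", "b", "c", "1"]

def pentagon_leq(x, y):
    return x == y or x == "0" or y == "1" or (x, y) == ("a", "c")

# Hypothetical example 2: the divisors of 12 ordered by divisibility.
DIVISORS = [1, 2, 3, 4, 6, 12]

def divides(x, y):
    return y % x == 0

print(is_modular_pair(PENTAGON, pentagon_leq, "b", "c"))  # M(b, c)
print(is_modular_pair(PENTAGON, pentagon_leq, "c", "b"))  # M(c, b)
print(all(is_modular_pair(DIVISORS, divides, b, c)
          for b in DIVISORS for c in DIVISORS))
```

In the pentagon M(c, b) holds while M(b, c) fails, so the relation M is not symmetric there and the pentagon is not semi-modular; in the divisor lattice every pair is modular.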
2. Geometries and geometric lattices. It is convenient for our purpose to
take as the undefined concepts of geometry the set consisting of all the
points and the function which associates with every set of points the sub-
space which it spans. Thus we introduce :
DEFINITION 2.1. By a GEOMETRY we mean an ordered pair <S, C>
consisting of a set S and a function C which associates with every subset X
of S another subset C(X) of S in such a way that the following conditions are
satisfied:
(i) $X \subseteq C(X) = C(C(X))$ for every subset X of S.
(ii) $C(p) = p$ for every $p \in S$. ²
(iii) $C(\emptyset) = \emptyset$. ³
(iv) For every subset X of S, C(X) is the union of all sets of the form C(Y)
with Y a finite subset of X.
¹ Such lattices are sometimes called upper continuous, but since the dual concept
of a lower continuous lattice will not be needed here, no confusion will be caused by
the present terminology.
DEFINITION 2.2. Suppose <S, C> is a geometry.
(i) An element of S is called a POINT of <S, C>.
(ii) A set of the form $C(X)$ with $X \subseteq S$ is called a SUBSPACE of <S, C>. If
$Y = C(X)$, then Y is said to be SPANNED by X.
(iii) A subspace of <S, C> is said to be n-DIMENSIONAL if it is spanned by
a set with n + 1 elements but is not spanned by any set with fewer
than n + 1 elements.
(iv) By a LINE and a PLANE of <S, C> we mean, respectively, a one-dimensional
and a two-dimensional subspace of <S, C>.
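Definitions 2.1 and 2.2 can be illustrated on a small finite example. The sketch below is an assumption of ours, not part of the text: we take for S the seven nonzero vectors of the three-dimensional vector space over the two-element field, encoded as the integers 1 to 7 so that vector addition is bitwise XOR, and we let C(X) consist of the nonzero vectors in the linear span of X. The function names are hypothetical.

```python
from itertools import combinations

# Assumed example: nonzero vectors of GF(2)^3 as the integers 1..7 (XOR = addition).
POINTS = frozenset(range(1, 8))

def closure(xs):
    """C(X): the nonzero vectors in the linear span of X over GF(2)."""
    span = {0}
    for v in xs:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})

def subsets(points):
    """All subsets of a finite point set, as frozensets."""
    pts = sorted(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)

def dimension(subspace):
    """Smallest n such that some (n + 1)-element set spans the subspace."""
    for size in range(len(subspace) + 1):
        if any(closure(X) == subspace
               for X in combinations(sorted(subspace), size)):
            return size - 1

# Conditions (i)-(iii) of Definition 2.1; (iv) is automatic when S is finite.
for X in subsets(POINTS):
    assert X <= closure(X) == closure(closure(X))
assert all(closure({p}) == frozenset({p}) for p in POINTS)
assert closure(set()) == frozenset()

SUBSPACES = {closure(X) for X in subsets(POINTS)}
LINES = [s for s in SUBSPACES if dimension(s) == 1]
print(len(SUBSPACES), len(LINES))
```

In this example the subspaces are the empty set, the seven points, seven three-point lines, and the whole set S, which is a plane in the sense of Definition 2.2(iv).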
From 2.1(iv) it follows that if X and Y are subsets of S, and if $X \subseteq Y$,
then $C(X) \subseteq C(Y)$. Together with 2.1(i)-(iii) this yields:
THEOREM 2.3. The family $\mathcal{A}$ of all subspaces of a geometry <S, C> has
the following properties:
(i) S and $\emptyset$ are members of $\mathcal{A}$.
(ii) Every one-element subset of S is a member of $\mathcal{A}$.
(iii) The intersection of any number (finite or infinite) of sets belonging to
$\mathcal{A}$ is a member of $\mathcal{A}$.
² Strictly speaking $C(\{p\}) = \{p\}$. We shall also write $C(p, q)$, $C(p, q, r)$, ...,
$C(X, p)$, $C(X, p, q)$, ... for $C(\{p, q\})$, $C(\{p, q, r\})$, ..., $C(X \cup \{p\})$, $C(X \cup \{p, q\})$, ...
³ $\emptyset$ is the empty set.
The tie-up between geometries and lattices is now easily established. In
fact, if a family $\mathcal{A}$ of subsets of a set S has the properties 2.3(i)-(iii), then
$\mathcal{A}$ is a complete and atomistic lattice under set-inclusion. The lattice
product of any system $X_i$, $i \in I$, of sets belonging to $\mathcal{A}$ is their set-theoretic
intersection, and the lattice sum of the sets $X_i$ is the smallest
member of $\mathcal{A}$ which contains their union. The atoms of $\mathcal{A}$ are the one-element
subsets of S. Conversely, any complete and atomistic lattice A is
isomorphic to a family $\mathcal{A}$ consisting of subsets of some set S and satisfying
2.3(i)-(iii). In fact, we may take for S the set of all atoms of A and
correlate with each element x of A the set consisting of all the atoms p of A
for which $p \le x$. This leads to
DEFINITION 2.4. A lattice is said to be GEOMETRIC if it is isomorphic
to the lattice of all subspaces of some geometry.
Each of the next two theorems gives an axiomatic characterization of
geometric lattices. The first is an immediate consequence of the definitions
involved.
THEOREM 2.5. A lattice is geometric if and only if it is complete and
atomistic, and has the property that for any atom p and any system of atoms
$q_i$, $i \in I$, the condition
\[ p \le \sum_{i \in I} q_i \]
implies that there exists a finite subset J of I such that
\[ p \le \sum_{i \in J} q_i . \]
THEOREM 2.6. A lattice is geometric if and only if it is complete, atom-
istic and continuous.
3. The exchange property. Our notion of a geometry is an extremely
general one and cannot be expected to have many interesting conse-
quences. It may for instance happen that two distinct lines have more
than one point in common, and in fact it is easy to construct geometries
where one line is properly contained in another. We now consider a
condition which excludes such pathological situations.
DEFINITION 3.1. A geometry <S, C> is said to have the EXCHANGE
PROPERTY if, for any points p and q and any subset X of S, the conditions
$p \in C(X, q)$ and $p \notin C(X)$ jointly imply that $q \in C(X, p)$.
DEFINITION 3.2. By a MATROID LATTICE we mean a geometric lattice
with the property that, for any atoms p and q and any element x, the conditions
$p \le q + x$ and $p \nleq x$ jointly imply that $q \le p + x$.
THEOREM 3.3. In order for a lattice A to be isomorphic to the lattice of
all subspaces of a geometry which has the exchange property it is necessary and
sufficient that A be a matroid lattice.
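The exchange property of Definition 3.1 can be tested mechanically on finite geometries. The sketch below rests on two assumed examples of our own: the seven-point geometry of nonzero vectors over the two-element field (points 1 to 7, with C given by linear span), and a four-point geometry on {a, b, c, d} whose subspaces are listed explicitly and in which the line {a, b} is properly contained in the line {a, b, c}. All names are hypothetical.

```python
from itertools import combinations

def subsets(points):
    """All subsets of a finite point set, as frozensets."""
    pts = sorted(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)

def has_exchange_property(points, C):
    """Definition 3.1: p in C(X, q) and p not in C(X) imply q in C(X, p)."""
    for X in subsets(points):
        for q in points:
            for p in C(X | {q}) - C(X):
                if q not in C(X | {p}):
                    return False
    return True

def fano_closure(xs):
    """Assumed example: nonzero vectors in the GF(2)-span of xs (XOR = addition)."""
    span = {0}
    for v in xs:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})

# Assumed pathological example: the subspaces are listed explicitly; the
# family contains all one-element sets and is closed under intersection.
CLOSED_SETS = [frozenset(s) for s in ("", "a", "b", "c", "d", "ab", "abc", "abcd")]

def nested_closure(xs):
    """C(X): the smallest listed subspace containing X."""
    xs = frozenset(xs)
    return min((F for F in CLOSED_SETS if xs <= F), key=len)

print(has_exchange_property(set(range(1, 8)), fano_closure))
print(has_exchange_property(set("abcd"), nested_closure))
```

In the second example take X = {a}, q = c and p = b: then p lies in C(X, q) = {a, b, c} but not in C(X), while q does not lie in C(X, p) = {a, b}, so the exchange property fails.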
THEOREM 3.4. Every matroid lattice is relatively complemented.
Matroid lattices have been extensively investigated. Of the numerous
equivalent characterizations of this class of lattices, the one given in the
next theorem is particularly interesting.
THEOREM 3.5. In order for a lattice A to be a matroid lattice it is neces-
sary and sufficient that A be complete, atomistic, continuous and semi-
modular.
THEOREM 3.6. In any matroid lattice the following conditions hold for
all elements $a, b, c, d$, and all atoms $p, q, p_0, p_1, \ldots, p_n$:
(i) If $a < a + p \le a + q$, then $a + p = a + q$.
(ii) If $ap = 0$, then $a + p$ covers $a$.
(iii) If $(a + b)p = 0$, then $(a + p)b = ab$.
(iv) If $(p_0 + p_1 + \cdots + p_{k-1})p_k = 0$ for $k = 1, 2, \ldots, n$, then the
system $p_i$, $i = 0, 1, \ldots, n$, is independent.
(v) If $a$ and $b$ cover $ab$, then $a + b$ covers $a$ and $b$.
(vi) If $a$ covers $ab$, then $a + b$ covers $b$.
(vii) If $b$ covers $bc$, then $M(b, c)$.
(viii) If $bc < a < c < b + c$, then there exists an element $x$ such that
$bc < x \le b$ and $a = (a + x)c$.
(ix) If $bc < a < c < b + c$, then there exists an element $x$ such that
$bc < x \le b$ and $(a + x)c < c$.
(x) If $bc < a < c < a + b$, then there exists an element $x$ such that
$bc < x \le b$ and $a = (a + x)c$.
Conversely, any geometric lattice which satisfies one of the conditions
(i)-(x) is a matroid lattice.
4. Strongly planar geometries. In the classical approach to affine
geometry, a set X of points is by definition a subspace if and only if it
contains every line with which it has two distinct points in common, and
contains every plane with which it has three non-collinear points in com-
mon. In projective geometry the first of these two conditions alone is
taken as the characteristic property of a subspace. Thus it is true in
either case that if $C(p, q, r) \subseteq X$ whenever $p, q, r \in X$, then X is a subspace.
Another property common to the classical affine and projective
geometries is the fact that two intersecting planes which are contained in
the same 3-space have a line in common. This motivates the next two
definitions.
DEFINITION 4.1. A geometry <S, C> is said to be PLANAR if it has the
exchange property and, for every subset X of S, the condition
\[ C(p, q, r) \subseteq X \quad\text{whenever } p, q, r \in X \]
implies that X is a subspace of <S, C>.
DEFINITION 4.2. A geometry <S, C> is said to be STRONGLY PLANAR if
it is planar and has the property that any two distinct planes that are con-
tained in the same 3-space are either disjoint or else their intersection is a line.
No simple condition is known which characterizes those lattices which
correspond to planar geometries. As regards strongly planar geometries
we have:
THEOREM 4.3. A geometry <S, C> is strongly planar if and only if it has
the exchange property and, for any points p, q, r, and any set X of points,
the conditions
\[ p \in C(X, q), \quad r \in C(X) \]
jointly imply that there exists a point s such that
\[ p \in C(q, r, s) \quad\text{and}\quad s \in C(X). \]
THEOREM 4.4. For any matroid lattice A the following conditions are
equivalent:
(i) A is isomorphic to the lattice of all subspaces of a strongly planar
geometry.
(ii) For any atoms p, q, r of A and any element a of A, the conditions
\[ p \le q + a \quad\text{and}\quad r \le a \]
jointly imply that there exists an atom s such that
\[ p \le q + r + s \quad\text{and}\quad s \le a. \]
(iii) A is special.
(iv) For any element a of A and any dual atom h of A, if $0 < ah < a$,
then a covers ah.
5. Projective geometries. In order to obtain a concept that corresponds
more or less to the classical notion of a projective geometry we need
axioms to the effect that any two lines in the same plane have a point in
common, and that if a set X of points has the property that it contains
every line with which it has two points in common, then X is a subspace.
These two conditions can be stated as a single axiom :
DEFINITION 5.1. A geometry <S, C> is said to be PROJECTIVE if it has
the exchange property and, for any points p and q and any set X of points,
the conditions
\[ p \in C(X, q) \quad\text{and}\quad p \ne q \]
jointly imply that there exists a point r such that
\[ p \in C(q, r) \quad\text{and}\quad r \in C(X). \]
DEFINITION 5.2. A lattice is said to be PROJECTIVE if and only if it is
isomorphic to the lattice of all subspaces of a projective geometry.
COROLLARY 5.3. Every projective geometry is strongly planar.
COROLLARY 5.4. Suppose p is a point and X and Y are sets of points
of a projective geometry <S, C>. If
\[ p \in C(X, Y), \quad p \notin C(X) \quad\text{and}\quad p \notin C(Y), \]
then there exist points q and r such that
\[ p \in C(q, r), \quad q \in C(X) \quad\text{and}\quad r \in C(Y). \]
With the aid of these two corollaries we get a particularly elegant
characterization of projective lattices:
THEOREM 5.5. A lattice is projective if and only if it is complete,
atomistic, continuous and modular.
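The role of the modular law in Theorem 5.5 can be seen by comparing two small geometries. The sketch below uses assumed examples of our own: the seven-point projective plane obtained from the nonzero vectors over the two-element field, and a four-point plane in which every two-element set is a line (the affine plane with two points on each line). The lattice sum of two subspaces is computed as C of their union and the lattice product as their intersection; the function names are hypothetical.

```python
from itertools import combinations

def subsets(points):
    """All subsets of a finite point set, as frozensets."""
    pts = sorted(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)

def is_modular(points, C):
    """Check (x + b)c = x + bc for all subspaces b, c and all x <= c."""
    lattice = {C(X) for X in subsets(points)}
    return all(
        (C(x | b) & c) == C(x | (b & c))
        for b in lattice
        for c in lattice
        for x in lattice
        if x <= c
    )

# Assumed example 1: nonzero vectors of GF(2)^3 as integers 1..7, C = linear span.
FANO_POINTS = frozenset(range(1, 8))

def fano_closure(xs):
    span = {0}
    for v in xs:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})

# Assumed example 2: four points; sets with at most two points are subspaces,
# and any three points span the whole plane.
AFFINE_POINTS = frozenset(range(4))

def affine_closure(xs):
    xs = frozenset(xs)
    return xs if len(xs) <= 2 else AFFINE_POINTS

print(is_modular(FANO_POINTS, fano_closure))
print(is_modular(AFFINE_POINTS, affine_closure))
```

In the second example two disjoint lines b and c together with a point x on c violate the modular law, since (x + b)c = c while x + bc = x.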
The notion of a projective geometry as defined here is obviously more
general than the classical concept, since we put no restriction on the
dimension and do not exclude geometries in which there are degenerate
lines consisting of only two distinct points. However, this generalization
is less radical than it might appear at first glance. Every projective lattice
A is a direct product of indecomposable sublattices,
\[ A = \prod_{i \in I} A_i . \]
When applied to the lattice of all subspaces of a projective geometry
<S, C>, this decomposition corresponds to a partitioning of S into subspaces
$S_i$ in such a way that two distinct points belong to the same subspace
if and only if they determine a non-degenerate line. Some of these
components may be trivial, consisting of just one point or of just one line,
and others may be non-Arguesian planes. With these exceptions, we can
associate with each component $S_i$ a division ring and introduce coordinates
in the manner of classical geometry, with the sole difference that the number
of coordinates may be infinite.
This brings us to the subject of Desargues' Law:
DEFINITION 5.6. A geometry <S, C> is said to be ARGUESIAN if it is
projective and, for any points $p_0, p_1, p_2, q_0, q_1, q_2$, the condition
\[ C(p_1, q_1) \cap C(p_2, q_2) \subseteq C(p_0, q_0) \]
implies that
\[ C(p_1, p_2) \cap C(q_1, q_2) \subseteq C\bigl((C(p_0, p_1) \cap C(q_0, q_1)) \cup (C(p_0, p_2) \cap C(q_0, q_2))\bigr). \]
DEFINITION 5.7. A lattice is said to be ARGUESIAN if and only if it is
isomorphic to the lattice of all subspaces of an Arguesian geometry.
The formulation of Desargues' Law in Definition 5.6 differs from the
classical version in that no restriction is placed on the six points involved
(such as that they be distinct, or that the three pairs $p_i, q_i$, $i = 0, 1, 2$, lie
on three distinct but concurrent lines). However, the two formulations
are actually equivalent, for some of the special cases that are normally
excluded are actually valid in all projective geometries, while the re-
maining cases follow from the classical Desargues' Law.
It is of course easy to write down a lattice-theoretic version of Desar-
gues' Law, involving six atoms (5.8(ii)). It is an interesting fact that this
condition actually holds with the six atoms replaced by any six lattice
elements (5.8(iii)). Perhaps more important, however, is the fact that this
condition is actually equivalent to a lattice identity (5.8(iv)).
THEOREM 5.8. If A is a geometric lattice, then the following conditions
are equivalent:
(i) A is Arguesian.
(ii) A is modular and, for any atoms $p_0, p_1, p_2, q_0, q_1, q_2$ of A, the condition
\[ (p_1 + q_1)(p_2 + q_2) \le p_0 + q_0 \]
implies that
\[ (p_1 + p_2)(q_1 + q_2) \le (p_0 + p_1)(q_0 + q_1) + (p_0 + p_2)(q_0 + q_2). \]
(iii) For any elements $a_0, a_1, a_2, b_0, b_1, b_2 \in A$, the condition
\[ (a_1 + b_1)(a_2 + b_2) \le a_0 + b_0 \]
implies that
\[ (a_1 + a_2)(b_1 + b_2) \le (a_0 + a_1)(b_0 + b_1) + (a_0 + a_2)(b_0 + b_2). \]
(iv) For any elements $a_0, a_1, a_2, b_0, b_1, b_2 \in A$, if
... + b_2)[(a_0 + a_1)(b_0 + b_1) + (a_0 ...
... + b_2) ≤ a_0(a_1 + y) + b_0(b_1 + y).
Observe that in (iii) and (iv) we do not assume the modular law; it
turns out to be a consequence of the given conditions. ⁴ In terms of the
decomposition discussed above, a projective lattice A is Arguesian if and
only if none of its indecomposable factors is isomorphic to the lattice of
all subspaces of a non-Arguesian projective plane.
6. Affine geometries. We define an affine geometry to be a strongly
planar geometry in which Euclid's parallel axiom holds :
DEFINITION 6.1. A geometry <S, C> is said to be AFFINE if and only if
<S, C> is strongly planar and has the following property: For any plane P,
line L, and point p, the conditions
\[ p \in P, \quad L \subseteq P \quad\text{and}\quad p \notin L \]
jointly imply that there exists a unique line L' such that
\[ p \in L', \quad L' \subseteq P \quad\text{and}\quad L \cap L' = \emptyset. \]
DEFINITION 6.2. A lattice is said to be AFFINE if and only if it is iso-
morphic to the lattice of all subspaces of an affine geometry.
THEOREM 6.3. A lattice A is affine if and only if it is a special matroid
lattice with the following property: For any atoms p, q, and r, if
\[ p < p + q < p + q + r, \]
then there exists a unique element x such that
\[ r < x < p + q + r \quad\text{and}\quad (p + q)x = 0. \]
The relation between our concepts of an affine geometry and of a non-
degenerate projective geometry is precisely analogous to the relation
between their classical counterparts :
⁴ In fact, in any lattice A, (iv) implies (iii) and (iii) in turn implies that A is
modular.
THEOREM 6.4. If p is an atom of an affine lattice A, then the set of all
elements $x \in A$ with $p \le x$ is an indecomposable projective lattice under the
partially ordering relation defined on A.
THEOREM 6.5. If h is a dual atom of an indecomposable projective lattice
A, then the set $A_h$ consisting of 0 and of all elements $x \in A$ with $x \nleq h$ is
an affine lattice under the partially ordering relation on A. Conversely, for
any affine lattice B there exist an indecomposable projective lattice A and
a dual atom h of A such that B is isomorphic to $A_h$.
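The first half of Theorem 6.5 can be followed step by step in the smallest case. The sketch below is an illustration under assumptions of our own: A is taken to be the lattice of subspaces of the seven-point plane over the two-element field (points 1 to 7, C given by linear span), h is the line spanned by the points 1 and 2, and the expected affine lattice is that of the four-point plane in which every two-element set is a line. The names are hypothetical.

```python
from itertools import combinations

POINTS = frozenset(range(1, 8))  # assumed: nonzero vectors of GF(2)^3

def closure(xs):
    """C(X): the nonzero vectors in the GF(2)-span of X (XOR = addition)."""
    span = {0}
    for v in xs:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})

def subsets(points):
    pts = sorted(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)

SUBSPACES = {closure(X) for X in subsets(POINTS)}

h = closure({1, 2})  # a line of the plane, hence a dual atom of the lattice
A_h = {x for x in SUBSPACES if not x <= h} | {frozenset()}

# Deleting the points of h from each member of A_h should give the
# subspaces of the four-point affine plane on the remaining points.
affine_points = POINTS - h
trimmed = {x - h for x in A_h}
expected = {X for X in subsets(affine_points) if len(X) <= 2} | {affine_points}

order_preserved = all(
    (x <= y) == ((x - h) <= (y - h)) for x in A_h for y in A_h
)
print(trimmed == expected, len(trimmed) == len(A_h), order_preserved)
```

Since the correspondence x to x - h is one-to-one on $A_h$ and preserves the ordering in both directions, $A_h$ is isomorphic to the subspace lattice of the four-point plane in this instance.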
7. Applications of geometry to lattice theory. As can be seen from the
above discussion, the applications of lattice theory to the axiomatization
of geometry have yielded radically different and quite simple character-
izations of the geometries considered. Similar work has been done with
other types of geometries, and it is quite certain that more can be done
along these lines.
But these investigations have also aided, both directly and indirectly,
in the study of certain problems in lattice theory. We shall mention
briefly some examples that illustrate this point.
Modular lattices may be regarded as a generalization of projective
geometry. Since every projective lattice is complemented, it might be
more reasonable to consider only complemented modular lattices. Just
how far-reaching a generalization is this? A partial answer was provided
by von Neumann, who showed that every complemented modular lattice
which satisfies certain conditions (namely, possesses an n-frame with
$n \ge 4$) is isomorphic to the lattice of all principal left ideals of a regular
ring. Since a full matrix ring over a division ring is regular, this may be
regarded as a generalization of the coordinatization theorem for non-
Arguesian geometries. There is however an important problem open here :
To find a condition that is both necessary and sufficient in order for a
complemented modular lattice to be isomorphic to the lattice of all
principal left ideals of a regular ring.
A representation of a different kind was obtained by Frink, who proved
that every complemented modular lattice B is a sublattice of a projective
lattice A. The Frink geometry associated with B is a generalization of the
Stone space of a Boolean algebra; its points are the maximal proper dual
ideals of B, and the line through two points P and Q consists of all points
R such that $P \cap Q \subseteq R$. The subspace correlated with a given element
$x \in B$ is the set of all points P such that $x \in P$. It is known that this
embedding preserves all identities that hold in B, and from this it follows
in particular that A is Arguesian if and only if B satisfies the condition
(iv) of Theorem 5.8.
Still other investigations, by Baer and Inaba, that were inspired by the
coordinatization theorem, concern the lattice of all submodules of a
module over a ring of a certain type and, as a special case, the lattices of
all subgroups of a finite Abelian group. The principal result may be
regarded as a representation theorem for a certain class of modular
lattices, including all the finite dimensional projective lattices.
Even for modular lattices that are not complemented, Desargues' Law
in the form of the condition (iv) of Theorem 5.8 turns out to be significant.
Most of the modular lattices that arise in applications of lattice theory are
isomorphic to lattices of commuting equivalence relations, and in fact
all the known examples for which this is not the case are of a somewhat
pathological character. It is therefore natural to try to characterize
axiomatically the class of all those lattices for which such a representation
exists. It is not hard to prove that Desargues' Law is a necessary con-
dition, but it is still an open question whether this is also sufficient. On the
other hand, an infinite system of axioms (in the form of conditional
equations) is known, which is sufficient as well as necessary, and these
axioms are such that when applied to the lattice of all subspaces of a
projective geometry, they reduce to certain configuration theorems which
are valid in all Arguesian geometries.
The family of all equivalence relations over a set U, or equivalently the
family of all partitions of U, is a geometric lattice. The class consisting of
all lattices of this form (and of their isomorphic images) can be convenient-
ly characterized by describing the corresponding geometries. In fact, in
order for the lattice of all subspaces of a geometry <S, C> to be isomorphic
to the lattice of all equivalence relations over some set, it is necessary and
sufficient that the following conditions be satisfied:
(1) <S, C> is planar and has the exchange property.
(2) Each plane of <S, C> has either 3, 4, or 6 points.
(3) For each line L of <S, C>, either L has exactly two points, and there
are exactly two lines parallel to L, or else L has exactly three points and
there is no line parallel to L.
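The point counts in conditions (2) and (3) can be observed directly in a small partition lattice. The sketch below makes several assumptions of our own: U is taken to be a six-element set, the points of the geometry are the two-element subsets of U (each standing for the partition whose only non-trivial block is that pair), and C(X) consists of all pairs lying inside one block of the partition generated by X. The names are hypothetical.

```python
from itertools import combinations

N = 6  # assumed size of the underlying set U = {0, ..., 5}
POINTS = list(combinations(range(N), 2))  # one point per two-element block

def closure(pairs):
    """All pairs {u, v} lying in one block of the partition generated by `pairs`."""
    parent = list(range(N))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in pairs:
        parent[find(u)] = find(v)
    return frozenset(p for p in POINTS if find(p[0]) == find(p[1]))

# Two distinct points always span a line; three points span either a line
# or a plane, so the planes are the three-point spans that are not lines.
lines = {closure(pair) for pair in combinations(POINTS, 2)}
planes = {closure(triple) for triple in combinations(POINTS, 3)} - lines

line_sizes = {len(s) for s in lines}
plane_sizes = {len(s) for s in planes}
print(sorted(line_sizes), sorted(plane_sizes))
```

In this example the lines have two or three points, according as the two spanning pairs are disjoint or overlap, and the planes have 3, 4, or 6 points.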
In Theorem 6.5, affine lattices are characterized as those lattices which
can be obtained from indecomposable projective lattices by removing all the
elements x contained in some fixed dual atom h, with the exception of the
zero element. If h is not a dual atom, this process still leads to a special
matroid lattice, but only half of Euclid's parallel axiom will be satisfied
(the uniqueness part). The question of what lattices can be obtained from
complemented modular lattices by removing more general sets, subject to
some suitable conditions, has been studied by Wilcox. Some of his results
have been announced in abstracts, but a detailed account has not yet
appeared.
These examples will suffice to illustrate the fact that the investigations
of the connections between geometries and lattices have yielded something
of interest to both subjects.
Bibliography
AMEMIYA, I., On the representation of complemented modular lattices. Journal of the
Mathematical Society of Japan, vol. 9 (1957), pp. 263-279.
BAER, R., A unified theory of projective spaces and finite abelian groups. Transactions
of the American Mathematical Society, vol. 52 (1942), pp. 283-343.
BAER, R., Linear algebra and projective geometry. New York 1952, VIII + 318 pp.
BIRKHOFF, G., Abstract linear dependence and lattices. American Journal of
Mathematics, vol. 57 (1935), pp. 800-804.
BIRKHOFF, G., Combinatorial relations in projective geometry. Annals of Mathematics (2),
vol. 36 (1935), pp. 743-748.
BIRKHOFF, G., Lattice theory. New York 1948, XIII + 283 pp.
BIRKHOFF, G., Metric foundations of geometry I. Transactions of the American Mathematical
Society, vol. 55 (1944), pp. 465-492.
BIRKHOFF, G. and FRINK, O., Representations of lattices by sets. Transactions of the American
Mathematical Society, vol. 64 (1948), pp. 299-316.
BIRKHOFF, G. and VON NEUMANN, J., The logic of quantum mechanics. Annals of Mathematics
(2), vol. 37 (1936), pp. 823-843.
CROISOT, R., Axiomatique des treillis semi-modulaires. Comptes Rendus Hebdomadaires
des Séances de l'Académie des Sciences (Paris), vol. 231 (1950), pp. 12-14.
CROISOT, R., Contribution à l'étude des treillis semi-modulaires de longueur infinie. Annales
Scientifiques de l'École Normale Supérieure (3), vol. 68 (1951), pp. 203-265.
CROISOT, R., Diverses caractérisations des treillis semi-modulaires, modulaires et distributifs.
Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences
(Paris), vol. 231 (1950), pp. 1399-1401.
CROISOT, R., Quelques applications et propriétés des treillis semi-modulaires de longueur
infinie. Annales de la Faculté des Sciences de l'Université de Toulouse pour les
Sciences Mathématiques et les Sciences Physiques (4), vol. 16 (1952), pp. 11-74.
CROISOT, R., Sous-treillis, produits cardinaux et treillis homomorphes des treillis
semi-modulaires. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences
(Paris), vol. 232 (1951), pp. 27-29.
DILWORTH, R. P., Dependence relations in a semi-modular lattice. Duke Mathematical
Journal, vol. 11 (1944), pp. 575-587.
DILWORTH, R. P., Ideals in Birkhoff lattices. Transactions of the American Mathematical
Society, vol. 49 (1941), pp. 325-353.
DILWORTH, R. P., Note on complemented modular lattices. Bulletin of the American Mathematical
Society, vol. 46 (1940), pp. 74-76.
DILWORTH, R. P., The arithmetic theory of Birkhoff lattices. Duke Mathematical Journal, vol. 8
(1941), pp. 286-299.
DUBREIL-JACOTIN, M. L., LESIEUR, L. and CROISOT, R., Leçons sur la théorie des
treillis des structures algébriques ordonnées et des treillis géométriques. Paris 1953,
VIII + 385 pp.
FICKEN, F. A., Cones and vector spaces. American Mathematical Monthly, vol. 47
(1940), pp. 530-533.
FRINK, O., Jr., Complemented modular lattices and projective spaces of infinite
dimension. Transactions of the American Mathematical Society, vol. 60 (1946),
pp. 452-467.
FRYER, K. D. and HALPERIN, I., Coordinates in geometry. Transactions of the Royal
Society of Canada, vol. 48 (1954), pp. 11-26.
FRYER, K. D. and HALPERIN, I., On the coordinatization theorem of J. von Neumann. Canadian
Journal of Mathematics, vol. 7 (1955), pp. 432-444.
FRYER, K. D. and HALPERIN, I., The von Neumann coordinatization theorem for complemented modular
lattices. Acta Universitatis Szegediensis. Acta Scientiarum Mathematicarum,
vol. 16 (1956), pp. 203-249.
HALL, M. and DILWORTH, R. P., The imbedding problem for modular lattices.
Annals of Mathematics (2), vol. 45 (1944), pp. 450-456.
HALPERIN, I., Additivity and continuity of perspectivity. Duke Mathematical Journal,
vol. 5 (1939), pp. 503-511.
HALPERIN, I., Dimensionality in reducible geometries. Annals of Mathematics (2), vol. 40
(1939), pp. 581-599.
HALPERIN, I., On the transitivity of perspectivity in continuous geometries. Transactions of
the American Mathematical Society, vol. 44 (1938), pp. 537-562.
HSU, C., On lattice theoretic characterization of the parallelism in affine geometry.
Annals of Mathematics (2), vol. 50 (1949), pp. 1-7.
INABA, E., On primary lattices. Journal of the Faculty of Science, Hokkaido
University, vol. 11 (1948), pp. 39-107.
INABA, E., Some remarks on primary lattices. Natural Science Report of the Ochanomizu
University, vol. 2 (1951), pp. 1-5.
IWAMURA, T., On continuous geometries. I. Japanese Journal of Mathematics, vol.
19 (1944), pp. 57-71.
IWAMURA, T., On continuous geometries. II. Journal of the Mathematical Society of Japan,
vol. 2 (1950), pp. 148-164.
IZUMI, S., Lattice theoretic foundation of circle geometry. Proceedings of the Imperial
Academy (Tokyo), vol. 16 (1940), pp. 515-517.
JÓNSSON, B., Modular lattices and Desargues' theorem. Mathematica Scandinavica,
vol. 2 (1954), pp. 295-314.
JÓNSSON, B., On the representation of lattices. Mathematica Scandinavica, vol. 1 (1953), pp.
193-206.
KAPLANSKY, I., Any orthocomplemented complete modular lattice is a continuous
geometry. Annals of Mathematics (2), vol. 61 (1955), pp. 524-541.
KODAIRA, K. and HURUYA, S., On continuous geometries I, II, III (in Japanese).
Zenkoku Shijo Sugaku Danwakai, vol. 168 (1938), pp. 514-531; vol. 169 (1938),
pp. 593-609; vol. 170 (1938), pp. 638-656.
KÖTHE, G., Die Theorie der Verbände, ein neuer Versuch zur Grundlegung der Algebra
und der projectiven Geometrie. Jahresbericht der Deutschen Mathematiker
Vereinigung, vol. 47 (1937), pp. 125-144.
KRISHNAN, V. S., Partially ordered sets and projective geometry. The Mathematics
Student, vol. 12 (1944), pp. 7-14.
LOOMIS, L. H., The lattice theoretic background of the dimension theory of operator
algebras. Memoirs of the American Mathematical Society 1955, No. 18, 36 pp.
MACLANE, S., A lattice formulation for transcendence degrees and p-bases. Duke
Mathematical Journal, vol. 4 (1938), pp. 455-468.
MAEDA, F., A lattice formulation for algebraic and transcendental extensions in
abstract algebras. Journal of Science of the Hiroshima University, vol. 16
(1952-1953), pp. 383-397.
MAEDA, F., Continuous geometry (in Japanese). Tokyo 1952, 2 + 3 + 225 pp.
MAEDA, F., Dimension functions on certain general lattices. Journal of Science of the
Hiroshima University, vol. 19 (1955), pp. 211-237.
MAEDA, F., Dimension lattice of reducible geometries. Journal of Science of the Hiroshima
University, vol. 13 (1944), pp. 11-40.
MAEDA, F., Direct sums and normal ideals of lattices. Journal of Science of the Hiroshima
University, vol. 14 (1949-1950), pp. 85-92.
MAEDA, F., Embedding theorem of continuous regular rings. Journal of Science of the
Hiroshima University, vol. 14 (1949-1950), pp. 1-7.
MAEDA, F., Lattice theoretic characterization of abstract geometries. Journal of Science of the
Hiroshima University, vol. 15 (1951-1952), pp. 87-96.
MAEDA, F., Matroid lattices of infinite length. Journal of Science of the Hiroshima
University, vol. 15 (1951-1952), pp. 177-182.
MAEDA, F., Representations of orthocomplemented modular lattices. Journal of Science of the
Hiroshima University, vol. 14 (1949-1950), pp. 93-96.
MAEDA, F., The center of lattices (in Japanese). Journal of Science of the Hiroshima
University, vol. 12 (1942), pp. 11-15.
MENGER, K., Algebra der Geometrie (Zur Axiomatik der projectiven
Verknüpfungsbeziehungen). Ergebnisse eines mathematischen Kolloquiums, vol. 7 (1936), pp.
11-12.
MENGER, K., Axiomatique simplifiée de l'algèbre de la géométrie projective. Comptes Rendus
Hebdomadaires des Séances de l'Académie des Sciences (Paris), vol. 206 (1938),
pp. 308-310.
MENGER, K., Bemerkungen zu Grundlagenfragen IV. Axiomatik der endlichen Mengen und
der elementargeometrischen Verknüpfungsbeziehungen. Jahresbericht der
Deutschen Mathematiker Vereinigung, vol. 37 (1928), pp. 309-325.
MENGER, K., La géométrie axiomatique de l'espace projectif. Comptes Rendus Hebdomadaires
des Séances de l'Académie des Sciences (Paris), vol. 228 (1949), pp. 1273-1274.
MENGER, K., New foundations of projective and affine geometry. Algebra of geometry. Annals
of Mathematics (2), vol. 37 (1936), pp. 456-482.
MENGER, K., Non-Euclidean geometry of joining and intersecting. Bulletin of the American
Mathematical Society, vol. 44 (1938), pp. 821-824.
MENGER, K., On algebra of geometry and recent progress in non-Euclidean geometry. The Rice
Institute Pamphlets, vol. 27 (1940), pp. 41-79.
MENGER, K., Selfdual postulates in projective geometry. American Mathematical Monthly,
vol. 55 (1948), p. 195.
MOUSINHO, M. L., Modular and projective lattices. Summa Brasiliensis Mathematicae,
vol. 2 (1950), pp. 95-112.
VON NEUMANN, J., Algebraic theory of continuous geometries. Proceedings of the
National Academy of Science, U.S.A., vol. 23 (1937), pp. 16-22.
VON NEUMANN, J., Continuous geometry. Proceedings of the National Academy of Science, U.S.A.,
vol. 22 (1936), pp. 92-100.
VON NEUMANN, J., Continuous rings and their arithmetics. Proceedings of the National Academy
of Science, U.S.A., vol. 23 (1937), pp. 341-349. Errata, ibid., p. 593.
VON NEUMANN, J., Examples of continuous geometries. Proceedings of the National Academy of
Science, U.S.A., vol. 22 (1936), pp. 101-108.
VON NEUMANN, J., Lectures on continuous geometries, I-III. Princeton 1936-1937. (Mimeographed
lecture notes.)
VON NEUMANN, J., On regular rings. Proceedings of the National Academy of Science, U.S.A., vol.
22 (1936), pp. 707-712.
VON NEUMANN, J. and HALPERIN, I., On the transitivity of perspective mappings. Annals of
Mathematics (2), vol. 41 (1940), pp. 87-93.
PRENOWITZ, WALTER, Total lattices of convex sets and of linear spaces. Annals of
Mathematics (2), vol. 49 (1948), pp. 659-688.
SASAKI, U., Lattice theoretical characterization of an affine geometry of arbitrary
dimension. Journal of Science of the Hiroshima University, vol. 16 (1952-1953),
pp. 223-238.
SASAKI, U., Lattice theoretic characterization of geometries satisfying "Axiome der
Verknüpfung". Journal of Science of the Hiroshima University, vol. 16 (1952-
1953), pp. 417-423.
SASAKI, U., Orthocomplemented lattices satisfying the exchange axiom. Journal of Science of
the Hiroshima University, vol. 17 (1953-1954), pp. 293-302.
SASAKI, U., Semi-modularity in relatively atomic, upper continuous lattices. Journal of
Science of the Hiroshima University, vol. 16 (1952-1953), pp. 409-416.
SASAKI, U. and FUJIWARA, S., The characterization of partition lattices. Journal of Science
of the Hiroshima University, vol. 15 (1951-1952), pp. 189-201.
SASAKI, U. and FUJIWARA, S., The decomposition of matroid lattices. Journal of Science of the
Hiroshima University, vol. 15 (1951-1952), pp. 183-188.
SCHÜTZENBERGER, M., Sur certains axiomes de la théorie des structures. Comptes
Rendus Hebdomadaires des Séances de l'Académie des Sciences (Paris), vol.
221 (1945), pp. 218-220.
WHITNEY, H., On the abstract properties of linear dependence. American Journal of
Mathematics, vol. 57 (1935), pp. 509-533.
WILCOX, L. R., An imbedding theorem for semi-modular lattices. Bulletin of the
American Mathematical Society, vol. 60 (1954), p. 532.
WILCOX, L. R., Modular extensions of semi-modular lattices. Bulletin of the American
Mathematical Society, vol. 61 (1955), pp. 524-525.
WILCOX, L. R., Modularity in Birkhoff lattices. Bulletin of the American Mathematical
Society, vol. 50 (1944), pp. 135-138.
WILCOX, L. R., Modularity in the theory of lattices. Annals of Mathematics (2), vol. 40 (1939),
pp. 490-505.
Symposium on the Axiomatic Method
CONVENTIONALISM IN GEOMETRY *
ADOLF GRÜNBAUM
Lehigh University, Bethlehem, Pennsylvania, U.S.A.
1. Introduction. In what sense and to what extent can the ascription
of a particular metric geometry to physical space be held to have an empirical
warrant? To answer this question we must inquire whether and
how empirical facts function restrictively so as to support a unique metric
geometry as the true description of physical space.
The inquiry is prompted by the conflict of ideas on this issue emerging
in the Albert Einstein volume in Schilpp's Library of Living Philosophers
between Robertson, Reichenbach and Einstein. Robertson characterizes
K. Schwarzschild's attempt to determine observationally the Gaussian
curvature of an astronomical 2-flat as an inspiring implementation of the
empiricist conception of physical geometry. And Robertson deems
Schwarzschild's view to be "in refreshing contrast to the pontifical
pronouncement of Henri Poincaré," [25, p. 325] who had declared that
"Euclidean geometry has, . . ., nothing to fear from fresh experiments"
[20, p. 81] after reviewing the various possible results of stellar parallax
measurements. In the same volume [21, p. 297] and elsewhere [22, Ch. 8;
23, pp. 30-37], Reichenbach maintains, as Carnap had done in his early
monograph Der Raum [3], that the question as to which metric geometry
prevails in physical space is indeed empirical but subject to an important
proviso: it becomes empirical only after a physical definition of con-
gruence for line segments has been given conventionally by stipulating (to
within a constant factor depending on the choice of unit) what length is to
be assigned to a transported solid rod in different positions of space.
Reichenbach calls this qualified empiricist conception "the relativity of
geometry" and terms "conventionalism" the more radical thesis that
even after the physical meaning of "congruent" has been fixed, it is
entirely a matter of convention which physical geometry is said to prevail.
Believing Poincaré to have been an exponent of conventionalism in this
sense, Reichenbach rejects Poincaré's supposed philosophy of geometry as
* The author is indebted to the National Science Foundation of the U.S.A. for
the support of research.
erroneous. On the other hand, Einstein criticizes Reichenbach's relativity
of geometry by upholding a particular version of conventionalism which
he attributes to Poincaré [9, pp. 676-679].
This exchange reveals that there are several different theses con-
cerning the presence of stipulational ingredients in physical geometry and
the warrant for their introduction which require critical examination in
the course of our inquiry.
Our main concern is with the respective roles of convention and fact in
the ascription of a particular metric geometry to physical space on the
basis of measurements with a rigid body. Accordingly, we shall discuss
in turn the two principal problems which have been posed in connection
with the formulation of the criterion of rigidity and of isochronism.
2. The Criterion of Rigidity: I. The Status of Spatial Congruence.
Differential geometry allows us to metrize a given physical surface,
say an infinite blackboard or some portion of it, in various ways so as
to acquire any metric geometry compatible with its topology. Thus,
if we have such a space and a net-work of Cartesian coordinates on it,
we can just as legitimately metrize the portion above the x-axis by means
of the metric $ds^2 = \frac{dx^2 + dy^2}{y^2}$, which confers a hyperbolic geometry on
that space, as by the Euclidean metric $ds^2 = dx^2 + dy^2$. The geometer is
not disconcerted by the fact that in the former metrization, the lengths of
horizontal segments whose termini have the same coordinate differences
$dx$ will be $ds = \frac{dx}{y}$ and will thus depend on where they are along the
y-axis. What is his sanction for preserving equanimity in the face of the
fact that this metrization commits him to regard a segment for which
$dx = 2$ at $y = 2$ as congruent to a segment for which $dx = 1$ at $y = 1$,
although the customary metrization would regard the length ratio of
these segments to be 2 : 1? His answer would be that unless one of two
segments is a subset of the other the congruence of two segments is a
matter of convention, stipulation or definition and not a factual matter
concerning which empirical findings could show one to have been
mistaken. He does not say, of course, that a transported solid rod will
coincide successively with the two hyperbolically-congruent segments
but allows for this non-coincidence by making the length of the
transported rod a suitable function of its position rather than a constant. And in
this way, he justifies his claim that the hyperbolic metrization possesses
both epistemological and mathematical credentials as good as those of the
Euclidean one.
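The arithmetic of this example can be checked directly. The following is a minimal sketch, not part of the original text, and its function names are purely illustrative. It computes the length of a horizontal segment at constant height y under each metrization: the Euclidean metric gives the coordinate difference itself, whereas under the hyperbolic metric dy = 0 along the segment, so that its length is the coordinate difference divided by y.

```python
def euclidean_length(delta_x: float) -> float:
    """Length of a horizontal segment under ds^2 = dx^2 + dy^2."""
    return abs(delta_x)


def hyperbolic_length(delta_x: float, y: float) -> float:
    """Length of a horizontal segment at height y > 0 under
    ds^2 = (dx^2 + dy^2) / y^2. Along the segment dy = 0, so ds = dx / y."""
    if y <= 0:
        raise ValueError("the metric is defined only above the x-axis (y > 0)")
    return abs(delta_x) / y


# The two segments of the text, written as (coordinate difference dx, height y).
segment_a = (2.0, 2.0)
segment_b = (1.0, 1.0)

ratio_euclid = euclidean_length(segment_a[0]) / euclidean_length(segment_b[0])
ratio_hyper = hyperbolic_length(*segment_a) / hyperbolic_length(*segment_b)

print(ratio_euclid)  # 2.0: the customary length ratio 2 : 1
print(ratio_hyper)   # 1.0: the segments are congruent in the hyperbolic metrization
```

The same pair of segments thus stands in the ratio 2 : 1 on one definition of congruence and 1 : 1 on the other, with no change in the coordinates themselves.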
This conception of congruence was vigorously contested by Bertrand
Russell and defended by Poincaré in a controversy which grew out of the
publication of Russell's Foundations of Geometry [28]. Our first concern
will be with the central issue of that debate.
Russell states the factualist's argument as follows [26, pp. 687-688] 1:
"It seems to be believed that since measurement is necessary to
discover equality or inequality, these cannot exist without measure-
ment. Now the proper conclusion is exactly the opposite. Whatever
one can discover by means of an operation must exist independently
of that operation : America existed before Christopher Columbus, and
two quantities of the same kind must be equal or unequal before
being measured. Any method of measurement is good or bad according
as it yields a result which is true or false. Mr. Poincaré, on the
other hand, holds that measurement creates equality and inequality.
It follows [then] . . . that there is nothing left to measure and that
equality and inequality are terms devoid of meaning."
Before setting forth the grounds for regarding Russell's argument here
as untenable, it will be useful to analyze the reasoning employed in an
inadequate criticism of it. This analysis will exhibit an important facet of
the relation of the axiomatic method in pure geometry to the description
of physical space.
We are told that Russell's contention can be dismissed by simply
pointing to the theory of models: since physical geometry is a semanti-
cally-interpreted abstract calculus, the customary physical interpretation
of the abstract relation term "congruent" (for line segments) as opposed
to the kind of interpretation given in our hyperbolic metrization above
clearly cannot itself be a factual statement. Hence it is argued that the
alternative metrizability of spatial and temporal continua should never
have been either startling or a matter for dispute. On this view, Poincaré
could have spared himself the trouble of polemicizing against Russell on
behalf of it in the form of a philosophical doctrine of congruence. For, so
the argument runs [7, pp. 9-10], there can be nothing particularly
problematic about the physical interpretation of the term "congruent": like the
physical meaning of all other primitives of the calculus, the denotata of
the abstract relation term "congruent" (for line segments) are specified by
1 An implicit endorsement of this argument is given by H. von Helmholtz [33,
p. 15].
semantical rules which are fully on a par in regard to both conventionality
and importance with those furnishing the interpretation of any of the
other abstract primitives of the calculus. In fact, Tarski's axioms for
elementary Euclidean geometry, which appear in this volume, even
dispense with the primitive "congruent" for line segments and yet yield
(the elementary form of) a metric geometry by using instead a quaternary
predicate δ denoting the equidistance relation between 4 points.
That such an argument does not go to the heart of the issue and hence
would have failed to convince Russell can be seen from the following:
The congruence relation for line segments, and correspondingly for
regions of surfaces and of 3-space, is a reflexive, symmetrical and tran-
sitive relation in these respective classes of geometrical configurations.
Thus, congruence is a kind of equality relation. Now suppose that one
believes, as Russell and Helmholtz thought they could believe justifiably,
that the spatial equality obtaining between congruent line segments
consists in their each containing the same intrinsic amount of space. Then
one will maintain that in any physico-spatial interpretation of an abstract
geometrical calculus, it is never legitimate to choose arbitrarily what
specific line segments are going to be called "congruent". And, by
the same token, one will assert that in Tarski's aforementioned
axiomatization, it is never arbitrary what quartets of physical points are to be
regarded as the denotata of his quaternary equidistance predicate δ.
Instead the imputation of an intrinsic metric to the extended continua of
space and time will issue in the following contentions: (i) since only
"truly equal" intervals may be called "congruent", Newton [18, pp. 6-8]
was right in insisting that there is only one true metrization of the time
continuum, and (ii) there is no room for choice as to the lines which are to
be called "straight" and hence no choice among alternative metric
geometries of physical space, since the geodesic requirement $\delta \int ds = 0$,
which must be satisfied by the straight lines, is imposed subject to the
restriction that only intrinsically congruent line elements may be assigned
the same length ds.
These considerations show that it will not suffice in this context simply
to take the model-theoretic conception of geometry for granted and there-
by to dismiss the Russell- Helmholtz claim peremptorily in favor of alter-
native metrizability. Rather what is needed is a refutation of the Russell-
Helmholtz root-assumption of an intrinsic metric: to exhibit the un-
tenability of that assumption is to provide the justification of the model-
theoretic affirmation that a given set of physico-spatial facts may be held
to be as much a realization of a Euclidean calculus as of a non-Euclidean
one yielding the same topology.
We shall now see how Riemann and Poincaré furnished the philosophical
underpinning for that affirmation.
The following statement in Riemann's Inaugural Dissertation [24, pp.
274, 286] contains a fundamental insight into the particular character of
the continuous manifolds of space and time:
"Definite parts of a manifold, which are distinguished from one
another by a mark or boundary are called quanta. Their quantitative
comparison is effected by means of counting in the case of discrete
magnitudes and by measurement in the case of continuous ones. 2
Measurement consists in bringing the magnitudes to be compared
into coincidence; for measurement, one therefore needs a means
which can be applied (transported) as a standard of magnitude. If it
is lacking, then two magnitudes can be compared only if one is a
[proper] part of the other and then only according to more or less,
not with respect to how much. ... in the case of a discrete manifold,
the principle [criterion] of the metric relations is already implicit
in [intrinsic to] the concept of this manifold, whereas in the case of a
continuous manifold, it must be brought in from elsewhere
[extrinsically]. Thus, either the reality underlying space must form a
discrete manifold or the reason for the metric relations must be
sought extrinsically in binding forces which act on the manifold."
Russell [28, pp. 66-67] and the writer [13] have noted that, contrary to
Riemann's apparent expectation, the first part of this statement will not
bear critical scrutiny as a characterization of continuous manifolds in
general. Riemann does, however, render here a fundamental feature of
the continua of physical space and time, which are manifolds whose
elements, taken singly, all have zero magnitude. And since our concern is
with the geo-chronometry of continuous physical space and time, we can
disregard defects in his account which do not affect its pertinence to the
latter continua. By the same token, we can ignore inadequacies arising
from his treatment of discrete and continuous types of order as jointly
exhaustive. Instead, we state the valid upshot of his conception relevant
to the spatio-temporal congruence issue before us. Construing his state-
ment as applying, not only to lengths but also, mutatis mutandis, to
areas and to volumes of higher dimensions, he gives the following
2 Riemann apparently does not consider sets which are neither discrete nor
continuous, but we shall consider the significance of that omission below.
sufficient condition for the intrinsic definability and non-definability
of a metric without claiming it to be necessary as well: in the case of a
discretely-ordered set, the "distance" between two elements can be
defined intrinsically in a rather natural way by the cardinality of the
"interval" determined by these elements. 3 On the other hand, upon
confronting the extended continuous manifolds of physical space and
time, we see that neither the cardinality of intervals nor any of their
other topological properties provide a basis for an intrinsically-defined
metric. The first part of this conclusion was tellingly emphasized by
Cantor's proof of the equi-cardinality of all positive intervals independent-
ly of their length. Thus, there is no intrinsic attribute of the space between
the end-points of a line-segment AB, or any relation between these two
points themselves, in virtue of which the interval AB could be said to
contain the same amount of space as the space between the termini of
another interval CD not coinciding with AB. Corresponding remarks
apply to the time continuum. Accordingly, the continuity we postulate
for physical space and time furnishes a sufficient condition for their
intrinsic metrical amorphousness. 4
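The contrast drawn here can be illustrated by a small sketch, which is not part of the original text and whose names are hypothetical. In a discretely ordered set such as the alphabet, counting the elements of an interval yields an intrinsic measure of its extent. For the continuum no such count is available: a one-to-one, order-preserving map pairs every point of the interval from 0 to 1 with a point of the interval from 0 to 2, so that, in line with Cantor's result, cardinality cannot distinguish the two.

```python
import string

ALPHABET = string.ascii_uppercase  # a conventionally ordered discrete set


def interval_cardinality(a: str, b: str) -> int:
    """Number of elements in the closed interval between letters a and b.
    In a discretely ordered set this count is available without any
    external standard of comparison."""
    return abs(ALPHABET.index(a) - ALPHABET.index(b)) + 1


def stretch(x: float) -> float:
    """Order-preserving one-to-one map of the interval [0, 1] onto [0, 2]."""
    return 2.0 * x


def unstretch(y: float) -> float:
    """Inverse of stretch: maps [0, 2] back onto [0, 1]."""
    return y / 2.0


# Discrete case: the intervals A..C and X..Z are "equal" by counting alone.
print(interval_cardinality("A", "C"), interval_cardinality("X", "Z"))

# Continuous case: each point of [0, 1] has exactly one partner in [0, 2],
# so counting points cannot show that one interval is "longer" than the other.
print(stretch(0.25), unstretch(stretch(0.25)))
```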
3 The basis for the discrete ordering is not here at issue: it can be conventional,
as in the case of the letters of the alphabet, or it may arise from special properties
and relations characterizing the objects possessing the specified order.
4 Clearly, this does not preclude the existence of sufficient conditions other than
continuity for the intrinsic metrical amorphousness of sets. But one cannot invoke
densely-ordered, denumerable sets of points (instants) in an endeavor to show that
discontinuous sets of such elements may likewise lack an intrinsic metric: even
without measure theory, ordinary analytic geometry allows the deduction that the
length of a denumerably infinite point set is intrinsically zero. This result is evident
from the fact that since each point (more accurately, each unit point set or degenerate
subinterval) has length zero, we obtain zero as the intrinsic length of the densely-
ordered denumerable point set upon summing, in accord with the usual limit
definition, the sequence of zero lengths obtainable by denumeration (cf. Grünbaum
[11, pp. 297-298]). More generally, the measure of a denumerable point set is always
zero (cf. Hobson [15, p. 166]) unless one succeeds in developing a very restrictive
intuitionistic measure theory of some sort.
These considerations show incidentally that space-intervals cannot be held to be
merely denumerable aggregates. Hence in the context of our post-Cantorean
meaning of "continuous", it is actually not as damaging to Riemann's statement as it
might seem prima facie that he neglected the denumerable dense sets by incorrectly
treating the discrete and continuous types of order as jointly exhaustive. Moreover,
since the distinction between denumerable and super-denumerable dense sets was
almost certainly unknown to Riemann, it is likely that by "continuous" he merely
intended the property which we now call "dense". Evidence of such an earlier usage
of "continuous" is found as late as 1914: cf. Russell [27, p. 138].
The axioms of congruence [35, pp. 42-50] preempt "congruent" to be a
spatial equality predicate but allow an infinitude of mutually-exclusive
congruence classes of intervals. There are no intrinsic metric attributes of
intervals, however, which could be invoked to single out one of these
congruence classes as unique. Hence only the choice of a particular
extrinsic congruence standard can determine a unique congruence class,
the rigidity of that standard under transport being decreed by convention.
And thus the role of this standard cannot be construed with Russell to be
the mere ascertainment of an otherwise intrinsic equality obtaining
between the intervals belonging to the congruence class defined by it.
Similarly for time intervals and the periodic devices which define temporal
congruence. And hence there can be no question at all of an empirically or
factually determinate metric geometry or chronometry until after a
physical stipulation of congruence. 5
A concluding remark on the special importance of the equality term
"congruent" (for line segments) vis-a-vis the other primitives of the
calculus will precede turning our attention to some of the import of the
conventionality of congruence.
Suitable alternative semantical interpretations of the term "congruent",
and correlatively of "straight line," can readily demonstrate
that, subject to the restrictions imposed by the existing topology, it is
always a live option to give either a Euclidean or a non-Euclidean de-
scription of the same body of physico-geometrical facts. The possibility
of alternative semantical interpretations of such other primitives of rival
geometrical calculi as "point" does not generally have such relevance to
this demonstration. Accordingly, when one is concerned, as we are here,
with noting that, even apart from the logic of induction, the empirical
facts themselves do not uniquely dictate the truth of either Euclidean
geometry or of one of its non-Euclidean rivals, then the situation is as
follows: the different physical interpretations of the term "congruent"
(and hence of "straight line") in the respective geometrical calculi enjoy a
more central importance in the discussion than the semantics of such
other primitives of these calculi as "point," since the latter generally have
the same physical meaning in both the Euclidean and non-Euclidean de-
scriptions. Moreover, once we cease to look at physical geometry as a
descriptively-interpreted system of abstract synthetic geometry and regard
it instead as an interpreted system of abstract differential geometry of the
5 For a detailed critique of A. N. Whitehead's perceptualistic objections to this
conclusion [34, ch. VI; 35, ch. III; 36, passim] see Grünbaum [13].
Gauss-Riemann type, the pre-eminent status of the interpretation of
"congruent" is seen to be beyond dispute: by choosing a particular
distance function $ds = \sqrt{g_{ik}\,dx^i\,dx^k}$ for the line element, we specify not
only what segments are congruent and what lines are straights (geodesics)
but the entire geometry, since the metric tensor $g_{ik}$ fully determines the
Gaussian curvature K. To be sure, if one were discussing not the
alternative between a Euclidean and non-Euclidean description of the same
spatial facts but rather the set of all models (including non-spatial ones) of
a given calculus, say the Euclidean one, then indeed the physical inter-
pretation of "congruent" and of "straight line" would not merit any more
attention than that of other primitives like "point".
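How the choice of a line element fixes the curvature can be made explicit for the two metrizations of the blackboard considered earlier. The worked step below relies on a standard result of differential geometry that is not stated in the text: for a line element of the conformal form $ds^2 = e^{2\varphi}(dx^2 + dy^2)$, the Gaussian curvature depends only on $\varphi$ and its second derivatives. The symbol $\varphi$ is introduced here solely for illustration.

```latex
\[
  ds^2 = e^{2\varphi(x,y)}\,\bigl(dx^2 + dy^2\bigr)
  \quad\Longrightarrow\quad
  K = -\,e^{-2\varphi}
      \left(\frac{\partial^2 \varphi}{\partial x^2}
          + \frac{\partial^2 \varphi}{\partial y^2}\right).
\]
% Euclidean metrization: ds^2 = dx^2 + dy^2, i.e. \varphi = 0.
\[
  \varphi = 0 \quad\Longrightarrow\quad K = 0 .
\]
% Hyperbolic metrization: ds^2 = (dx^2 + dy^2)/y^2, i.e. \varphi = -\ln y for y > 0.
\[
  \varphi = -\ln y
  \quad\Longrightarrow\quad
  \frac{\partial^2 \varphi}{\partial x^2} = 0, \qquad
  \frac{\partial^2 \varphi}{\partial y^2} = \frac{1}{y^2}, \qquad
  K = -\,y^2 \cdot \frac{1}{y^2} = -1 .
\]
```

On these assumptions the same coordinatized surface receives curvature 0 under the one choice of $g_{ik}$ and the constant negative curvature $-1$ under the other.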
The Import of Riemann's Conception of Congruence.
(a) F. Klein's Relative Consistency Proof of Hyperbolic Geometry and
H. Poincaré's Anschaulichkeitsbeweis of that geometry.
In the light of the conventionality of congruence, F. Klein's relative
consistency proof of hyperbolic geometry via a model furnished by the
interior of a circle on the Euclidean plane 6 appears as merely one par-
ticular kind of possible remetrization of the circular portion of that plane,
projective geometry having played the heuristic role of furnishing Klein
with a suitable definition of congruence. What from the point of view of
synthetic geometry appears as intertranslatability via a dictionary, appears
as alternative metrizability from the point of view of differential geometry.
Again, Poincaré's kind of Anschaulichkeitsbeweis of a three-dimensional
hyperbolic geometry via a model furnished by the interior of a sphere in
Euclidean space [20, pp. 75-8] is another example of remetrization. Here
the alteration in our customary definition of congruence is conveyed to
us pictorially by the effects of an inhomogeneous force field which
appropriately shrinks all bodies alike as seen from the point of view of the
normally Euclideanly-behaving bodies.
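The remetrization of the interior of a circle can be made concrete by a short sketch. It is not part of the original text and the function name is hypothetical. It uses a standard closed form for the distance between two points p and q of the unit disk in the Beltrami-Klein model, cosh d = (1 - p·q) / sqrt((1 - |p|²)(1 - |q|²)), which agrees with the projective (cross-ratio) definition when the curvature is normalized to -1. Under it, two chords of equal Euclidean length are not congruent: the one nearer the boundary is the longer.

```python
import math


def klein_distance(p, q):
    """Hyperbolic distance between points p and q of the open unit disk in
    the Beltrami-Klein model (curvature normalized to -1)."""
    px, py = p
    qx, qy = q
    pp = px * px + py * py
    qq = qx * qx + qy * qy
    if pp >= 1.0 or qq >= 1.0:
        raise ValueError("both points must lie strictly inside the unit circle")
    pq = px * qx + py * qy
    cosh_d = (1.0 - pq) / math.sqrt((1.0 - pp) * (1.0 - qq))
    # Guard against rounding that could push the value slightly below 1.
    return math.acosh(max(1.0, cosh_d))


# Two segments of the same Euclidean length 0.1 along a diameter.
near_center = klein_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = klein_distance((0.8, 0.0), (0.9, 0.0))

print(near_center, near_boundary)
```

From the Euclidean standpoint a rod transported toward the circumference would accordingly have to be described as shrinking, which is the pictorial device of the inhomogeneous force field mentioned above.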
(b) Poincaré and the Conventionality of Congruence.
The central theme of Poincaré's so-called conventionalism is essentially
an elaboration of the thesis of alternative metrizability whose fundamental
justification we owe to Riemann, and not [12, §5] the radical
conventionalism attributed to him by Reichenbach [23, p. 36].
Poincaré's much-cited and often misunderstood statement concerning the
possibility of always giving a Euclidean description of any results of stellar
parallax measurements is a less lucid statement of exactly the same point
6 For details, cf. Bonola [1, pp. 164-175]. For a summary of E. Beltrami's differ-
ent relative consistency proof, see Struik [31, pp. 152-3].
made by him with magisterial clarity in the following passage [20, p. 235]:
"In space we know rectilinear triangles the sum of whose angles is
equal to two right angles; but equally we know curvilinear triangles
the sum of whose angles is less than two right angles. ... To give the
name of straights to the sides of the first is to adopt Euclidean geometry;
to give the name of straights to the sides of the latter is to adopt
the non-Euclidean geometry. So that to ask what geometry it is proper
to adopt is to ask, to what line is it proper to give the name straight?
It is evident that experiment can not settle such a question."
Now, the equivalence of this contention to Riemann's view of con-
gruence becomes evident the moment we note that the legitimacy of
identifying lines which are curvilinear in the usual geometrical parlance
as "straights" is vouchsafed by the warrant for our choosing a new defi-
nition of congruence such that the previously curvilinear lines become
geodesics of the new congruence. Corresponding remarks apply to
Poincaré's contention that we can always preserve Euclidean geometry in
the face of any data obtained from stellar parallax measurements: if the
paths of light rays are geodesics on a particular definition of congruence,
as indeed they are in the Schwarzschild procedure cited by Robertson,
and if the paths of light rays are found parallactically to sustain non-
Euclidean relations on that metrization, then we need only choose a
different definition of congruence such that these same paths will no
longer be geodesics and that the geodesics of the newly chosen congruence
are Euclideanly related. From the standpoint of synthetic geometry, the
latter choice effects a renaming of optical and other paths and thus is
merely a recasting of the same factual content in Euclidean language
rather than a revision of the extra-linguistic content of optical and other
laws. 7 Since Poincaré's claim here is a straightforward elaboration of the
metric amorphousness of the continuous manifold of space, it is not clear
how Robertson can reject it as a "pontifical pronouncement" and even
regard it as being in contrast with what he calls Schwarzschild's "sound
operational approach to the problem of physical geometry." [25, pp.
324-5]. For Schwarzschild had rendered the question concerning the
prevailing geometry factual only by the adoption of a particular spatial
7 The remetrizational retainability of Euclideanism affirmed by Poincaré [20,
pp. 81-86] thus involves a merely linguistic interdependence of the geometric theory
of rigid solids and the optical theory of light rays. This interdependence is logically
different, as we shall see in Section 3, from P. Duhem's conception [6, Part II, ch.
VI] of an epistemological interdependence, which Einstein espouses.
metrization based on the travel times of light, which does indeed turn the
direct light paths of his astronomical triangle into geodesics.
There are two respects, however, in which Poincaré is open to criticism
in this connection:
(i) He maintained [20, p. 81] that it would always be regarded as most
convenient to preserve Euclidean geometry, even at the price of re-
metrization, on the grounds that this geometry is the simplest ana-
lytically [20, p. 65]. Precisely the opposite development materialized in
the general theory of relativity: Einstein forsook the simplicity of the
geometry itself in the interests of being able to maximize the simplicity
of the definition of congruence. He makes clear in his fundamental paper
of 1916 that had he insisted on the retention of Euclidean geometry in a
gravitational field, then he could not have taken "one and the same rod,
independently of its place and orientation, as a realization of the same
interval." [8, p. 161]
(ii) Even if the simplicity of the geometry itself were the sole determi-
nant of its adoption, that simplicity might be judged by criteria other
than Poincaré's analytical simplicity. Thus, Menger has urged that
from the point of view of a criterion grounded on the simplicity of the
undefined concepts used, hyperbolic and not Euclidean geometry is the
simplest [16, p. 66].
On the other hand, if Poincaré were alive today, he could point to an
interesting recent illustration of the sacrifice of the simplicity and
accessibility of the congruence standard on the altar of maximum
simplicity of the resulting theory. Astronomers have recently proposed
to remetrize the time continuum for the following reason: when the mean
solar second, which is a very precisely known fraction of the period of
the earth's rotation on its axis, is used as a standard of temporal con-
gruence, then there are three kinds of discrepancies between the actual
observational findings and those predicted by the usual theory of celestial
mechanics. The empirical facts thus present astronomers with the follow-
ing choice: Either they retain the rather natural standard of temporal
congruence at the cost of having to bring the principles of celestial
mechanics into conformity with observed fact by revising them appropri-
ately. Or they remetrize the time continuum, employing a less simple
definition of congruence so as to preserve these principles intact. Decisions
taken by astronomers in the last few years were exactly the reverse of
Einstein's choice of 1916 as between the simplicity of the standard of
congruence and that of the resulting theory. The mean solar second is to
be supplanted by a unit to which it is non-linearly related: the sidereal
year, which is the period of the earth's revolution around the sun, due
account being taken of the irregularities produced by the gravitational
influence of the other planets. 8
We see that the implementation of the requirement of descriptive
simplicity in theory-construction can take alternative forms, because
agreement of astronomical theory with the evidence now available is
achievable by revising either the definition of temporal congruence or the
postulates of celestial mechanics. The existence of this alternative likewise
illustrates that for an axiomatized physical theory containing a geo-
chronometry, it is gratuitous to single out the postulates of the theory as
having been prompted by empirical findings in contradistinction to
deeming the definitions of congruence to be wholly a priori, or vice versa.
This conclusion bears out geochronometrically Braithwaite's contention
in this volume that there is an important sense in which axiomatized
physical theory does not lend itself to compliance with Heinrich Hertz's
injunction to "distinguish thoroughly and sharply between the ele-
ments . . . which arise from the necessities of thought, from experience,
and from arbitrary choice." [14, p. 8]. 9
(c) The impossibility of defining congruence uniquely by stipulating a
particular metric geometry.
A question which arises naturally upon undertaking the mathematical
implementation of a given choice of a metric geometry in the context of a
particular set of topological facts is the following: do these facts in con-
junction with the desired metric geometry determine a unique definition
of congruence? If the answer were actually in the affirmative, as both
Carnap [3, pp. 54-55] and Reichenbach [23, pp. 33-34; 22, pp. 132-133]
have maintained, this would mean that the desired geometry would
uniquely specify a metric tensor under given factual circumstances and
thus, in a particular coordinate system, a unique set of functions $g_{ik}$.
But Carnap's and Reichenbach's assertion of uniqueness is erroneous, as
is demonstrated by showing that besides the customary definition of
congruence, which assigns the same length to the measuring rod every-
where and thereby confers a Euclidean geometry on an ordinary table top,
there are infinitely many other definitions of congruence which likewise
8 For a clear account of the relevant astronomical details, see Clemence [4].
9 Braithwaite's point was made independently by Pap [19], who argues that the
analytic-synthetic distinction cannot be upheld for partially-interpreted theoretical
languages like that of theoretical physics.
yield a Euclidean geometry for that surface but which make the length of a
rod depend on its orientation or position. Thus, consider our horizontal
table top equipped with a net-work of Cartesian coordinates x and y and
suppose that another such surface intersects the horizontal one at an
angle $\theta$ so that their line of intersection is both the y-axis of the horizontal
plane and the $\bar{y}$-axis of a rectangular system of coordinates $\bar{x}$ and $\bar{y}$ on the
inclined plane. Assume that the inclined plane has been metrized in the
customary way. But then remetrize the horizontal plane by calling
congruent in it those line segments which are the perpendicular projections
onto it of segments of the inclined plane that are equal in the latter's
metric. Accordingly, we have a mapping
$$\bar{x} = x \sec\theta$$
$$\bar{y} = y,$$
and we now assign to a line segment of the horizontal plane whose termini
have the coordinate differences $dx$ and $dy$ not the customary length
$\sqrt{dx^2 + dy^2}$ but rather
$$ds = \sqrt{d\bar{x}^2 + d\bar{y}^2} = \sqrt{\sec^2\theta\, dx^2 + dy^2}.$$
Nonetheless, upon using the new $g_{ik}$, which are introduced into the x, y
coordinates by the revised definition of congruence, to compute the
Gaussian curvature of the horizontal table top, we still obtain the
Euclidean value zero. And by merely varying the angle of inclination $\theta$, we
obtain infinitely many different definitions of congruence all of which make
the length of a given rod dependent on its orientation and yet impart a
Euclidean geometry to the horizontal table top. Thus, the requirement
of Euclideanism does not uniquely determine a metric tensor, and,
contrary to Carnap and Reichenbach, there are infinitely many ways in
which a measuring rod could squirm under transport as compared to its
customary behavior and still yield a Euclidean geometry. In fact, even for
plane Euclidean geometry, the class of congruence definitions is far wider
than the one-parameter family yielded by our particular isometric map-
pings of an inclined plane onto the horizontal one. Dr. Samuel Gulden,
to whom I presented the problem of determining the class of different
metric tensors for each kind of two-dimensional and three-dimensional
Riemannian space, has pointed out that (i) in the Euclidean case, upon
abandoning the restriction of our above isometric mappings to affine
coordinate transformations and considering non-linear transformations
with non-vanishing Jacobian, we can generate infinitely many other
metrizations whose associated Gaussian curvature is everywhere zero.
For example, for the admissible transformation between our two sets of
rectangular coordinates x, y and $\bar{x}$, $\bar{y}$ given by
$$\bar{x} = x + \tfrac{1}{3}y^3, \quad \text{and}$$
$$\bar{y} = \tfrac{1}{3}x^3 - y,$$
the distance function becomes
$$ds^2 = d\bar{x}^2 + d\bar{y}^2 = (1 + x^4)\,dx^2 + 2(y^2 - x^2)\,dx\,dy + (y^4 + 1)\,dy^2.$$
In this case, the length of a given rod is generally dependent both on
its position and on its orientation; (ii) the result obtained for Euclidean
space can be generalized to a very large class of Riemann spaces of
various dimensions.
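Both alternative Euclidean metrizations of the table top can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original text; the function names are hypothetical, and the non-linear map is taken to be the one whose differentials are dx + y² dy and x² dx - dy, which is what the quadratic form above requires. The sketch shows that the length assigned to a unit coordinate step depends on its orientation, and in the non-linear case also on its position, while each line element is still the customary one expressed in other rectangular coordinates, so that the curvature remains zero.

```python
import math


def inclined_plane_length(dx, dy, theta):
    """Length sqrt(sec^2(theta) dx^2 + dy^2) assigned to a segment of the
    horizontal plane by projection from a plane inclined at angle theta."""
    sec = 1.0 / math.cos(theta)
    return math.hypot(sec * dx, dy)


def nonlinear_ds2_expanded(x, y, dx, dy):
    """Quadratic form (1 + x^4) dx^2 + 2 (y^2 - x^2) dx dy + (y^4 + 1) dy^2."""
    return ((1 + x**4) * dx**2
            + 2 * (y**2 - x**2) * dx * dy
            + (y**4 + 1) * dy**2)


def nonlinear_ds2_from_map(x, y, dx, dy):
    """The same line element obtained from the differentials of the map:
    d(xbar) = dx + y^2 dy and d(ybar) = x^2 dx - dy."""
    d_xbar = dx + y**2 * dy
    d_ybar = x**2 * dx - dy
    return d_xbar**2 + d_ybar**2


theta = math.pi / 3  # sec(theta) = 2

# Orientation dependence: a unit step along x versus a unit step along y.
along_x = inclined_plane_length(1.0, 0.0, theta)
along_y = inclined_plane_length(0.0, 1.0, theta)
print(along_x, along_y)

# Position dependence in the non-linear case: the same step at two places.
step_at_origin = nonlinear_ds2_expanded(0.0, 0.0, 1.0, 0.0)
step_elsewhere = nonlinear_ds2_expanded(2.0, 0.0, 1.0, 0.0)
print(step_at_origin, step_elsewhere)

# Consistency of the expanded form with the differentials of the map.
sample = (0.7, -1.3, 0.2, 0.5)
print(math.isclose(nonlinear_ds2_expanded(*sample), nonlinear_ds2_from_map(*sample)))
```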
We are now ready to consider the second of the two principal problems
which have been posed in connection with the criterion of rigidity.
3. The Criterion of Rigidity: II. The Logic of Correcting for "Distorting"
Influences. Physical geometry is usually conceived as the system of metric
relations exhibited by transported solid bodies independently of their
particular chemical composition. On this conception, the criterion of
congruence can be furnished by a transported solid body for the purpose
of determining the geometry by measurement, only if the computational
application of suitable "corrections" (or, ideally, appropriate shielding)
has essentially eliminated inhomogeneous thermal, elastic, electric and
other influences, which produce changes of varying degree ("distortions")
in different kinds of materials. The demand for this elimination as a
prerequisite to the experimental determination of the geometry has a
thermodynamic counterpart: the requirement of a means for measuring
temperature which does not yield the discordant results produced by ex-
pansion thermometers at other than fixed points when different thermo-
metric substances are employed. This thermometric need is fulfilled
successfully by Kelvin's thermodynamic scale of temperature. But
attention to the implementation of the corresponding prerequisite of
physical geometry has led Einstein [9, pp. 676-678] to impugn the em-
pirical status of that geometry. He considers the case in which congruence
has been defined by the diverse kinds of transported solid measuring rods
as corrected for their respective idiosyncratic distortions with a view to then
making an empirical determination of the prevailing geometry. And in an
argument which he attributes to Poincaré, Einstein's thesis is that the
very logic of computing these corrections precludes that the geometry
itself be accessible to experimental ascertainment in isolation from other
physical regularities. Specifically, he states the case in the form of a
dialogue between Reichenbach and Poincaré (see footnote 10):
"Poincaré: The empirically given bodies are not rigid, and consequently
can not be used for the embodiment of geometric intervals.
Therefore, the theorems of geometry are not verifiable.
Reichenbach: I admit that there are no bodies which can be im-
mediately adduced for the "real definition" of the interval. Never-
theless, this real definition can be achieved by taking the thermal
volume-dependence, elasticity, electro- and magneto-striction, etc.,
into consideration. That this is really [and] without contradiction
possible, classical physics has surely demonstrated.
Poincaré: In gaining the real definition improved by yourself you
have made use of physical laws, the formulation of which presupposes
(in this case) Euclidean geometry. The verification, of which you
have spoken, refers, therefore, not merely to geometry but to the
entire system of physical laws which constitute its foundation. An
examination of geometry by itself is consequently not thinkable.
— Why should it consequently not be entirely up to me to choose
geometry according to my own convenience (i.e., Euclidean) and to
fit the remaining (in the usual sense "physical") laws to this choice
in such manner that there can arise no contradiction of the whole
with experience?"
The objection which Einstein presents here on behalf of conventionalism
is aimed at a conception of physical geometry which is empiricist merely
in Carnap's and Reichenbach's conditional sense explained in Section 1.
Einstein's criticism is that the rigid body is not even defined without
first decreeing the validity of Euclidean geometry. And the grounds he
gives for this conclusion are that before the corrected rod can be used to
make an empirical determination of the de facto geometry, the required
corrections must be computed via laws, such as those of elasticity, which
involve Euclideanly-calculated areas and volumes. But clearly the warrant
for thus introducing Euclidean geometry at this stage cannot be empirical.
10. It is rather doubtful that Poincaré himself espoused the version of
conventionalism which Einstein links to his name here: in speaking of the
variations which solids exhibit under distorting influences, Poincaré says
[20, p. 76]: "we neglect these variations in laying the foundations of
geometry, because, besides their being very slight, they are irregular and
consequently seem to us accidental."
I now wish to set forth my reasons for believing that Einstein's argu-
ment does not succeed in making physical geometry a matter of con-
vention rather than fact in a sense which is independent of the alternative
metrizability vouchsafed by spatio-temporal continuity.
There is no question that the laws used to make the corrections for
deformations [30, p. 60; 32, p. 408] involve areas and volumes in a funda-
mental way (e.g. in the definitions of the elastic stresses and strains) and
that this involvement presupposes a geometry, as is evident from the area
and volume formulae
$$A = \iint \sqrt{g}\,dx^{1}\,dx^{2} \quad\text{and}\quad V = \iiint \sqrt{g}\,dx^{1}\,dx^{2}\,dx^{3},$$
where "g" represents the determinant of the components $g_{ik}$ [10, p. 177].
Now suppose that we begin with a set of Euclideanly-formulated physical
laws P0 in correcting for the distortions induced by perturbations and
then use the thus Euclideanly-corrected congruence standard for empiri-
cally exploring the geometry of space by determining the metric tensor.
The initial stipulational affirmation of the Euclidean geometry G0 in the
physical laws P0 used to compute the corrections in no way assures that
the geometry obtained by the corrected rods will be Euclidean! If it is non-
Euclidean, then the question is: what will Einstein's fitting of the
physical laws to preserve Euclideanism and avoid a contradiction of the
total theoretical system with experience involve? Will the adjustments in
P0 necessitated by the retention of Euclideanism entail merely a change
in the dependence of the length assigned to the transported rod on such
non-positional parameters as temperature, pressure, magnetic field etc.?
Or could the putative empirical findings compel that the length of the
transported rod be likewise made a function of its position and orientation
in order to square the coincidence findings with the requirement of
Euclideanism? The temporal variability of distorting influences and the
possibility of obtaining non-Euclidean results by measurements carried
out in a spatial region uniformly characterized by standard conditions of
temperature, pressure, electric and magnetic field strength etc. show it to
be quite doubtful that the preservation of Euclideanism could always be
accomplished short of introducing the dependence of the rod's length on
position and orientation. Thus, the need for remetrizing in this sense in
order to retain Euclideanism cannot be ruled out. But this kind of re-
metrization does not provide the requisite support for Einstein's version
of conventionalism, whose onus it is to show that the geometry by itself
cannot be held to be empirical even when we exclude resorting to such
remetrization.
That the geometry may well be empirical in this sense is seen from the
following possibilities of its successful empirical determination. After
assumedly obtaining a non-Euclidean geometry G1 from measurements
with a rod corrected on the basis of Euclideanly-formulated physical
laws P0, we can revise P0 so as to conform to the non-Euclidean
geometry G1 just obtained by measurement. This retroactive revision of
P0 would be effected by recalculating such quantities as areas and
volumes on the basis of G1 and changing the functional dependencies
relating them to temperature and other physical parameters. We thus
obtain a new set of laws P1. Now we use this set P1 of laws to correct the
rods for perturbational influences and then determine the geometry with
the thus corrected rods. If the result is a geometry G2 different from G1,
then if there is convergence to a geometry of constant curvature, we must
repeat this process a finite number of times until the geometry Gn
ingredient in the laws Pn providing the basis for perturbation-corrections
is indeed the same to within experimental accuracy as the geometry
obtained by measurements with rods that have been corrected via the set
Pn.
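The procedure of successive approximation just described can be sketched in code. The sketch below is an illustration only, and it rests on assumptions which are not part of the argument: a geometry of constant curvature is represented by the single number K, its curvature; the function `toy_measure` is a purely hypothetical stand-in for the experimental step of measuring with rods corrected under laws presupposing an assumed curvature; and its linear form and `sensitivity` parameter are invented for the example.

```python
# Sketch of the iteration G0 -> G1 -> ... -> Gn, with each geometry of
# constant curvature represented by one number K (an assumption made here).

def successive_approximation(measure, k_start, tol=1e-6, max_rounds=50):
    """Repeat 'assume K, correct the rods, measure K' until the two agree.

    measure(k_assumed) must return the curvature found with rods corrected
    under laws presupposing k_assumed. Returns (curvature, rounds) on
    agreement to within tol, or (None, max_rounds) if there is none.
    """
    k_assumed = k_start
    for round_no in range(1, max_rounds + 1):
        k_measured = measure(k_assumed)
        if abs(k_measured - k_assumed) <= tol:
            return k_measured, round_no
        k_assumed = k_measured   # revise the laws to conform to the result
    return None, max_rounds

def toy_measure(k_assumed, k_true=-0.02, sensitivity=0.3):
    """Hypothetical response: the measured curvature deviates from k_true
    in proportion to the error in the assumed curvature."""
    return k_true + sensitivity * (k_assumed - k_true)
```

With a response of this hypothetical kind and a sensitivity below 1, the iteration reaches the same curvature whether it starts from the Euclidean value K = 0 or from some non-Euclidean value; with a sensitivity above 1 it fails to converge, which is the case taken up further on.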
If there is such convergence at all, it will be to the same geometry Gn
even if the physical laws used in making the initial corrections are not the
set P0, which presupposes Euclidean geometry, but a different set P
based on some non-Euclidean geometry or other. That there can exist only
one such geometry of constant curvature Gn would seem to be guaranteed
by the identity of Gn with the unique underlying geometry Gt characterized
by the following properties: (i) Gt would be exhibited by the coincidence
behavior of a transported rod if the whole of the space were actually
free of deforming influences, (ii) Gt would be obtained by measurements
with rods corrected for distortions on the basis of physical laws Pt
presupposing Gt, and (iii) Gt would be found to prevail in a given relatively
small, perturbation-free region of the space quite independently of the
assumed geometry ingredient in the correctional physical laws. Hence, if
our method of successive approximation does converge to a geometry Gn
of constant curvature, then Gn would be this unique underlying geometry
Gt. And, in that event, we can claim to have found empirically that Gt is
indeed the geometry prevailing in the entire space which we have explored.
But what if there is no convergence? It might happen that whereas
convergence would obtain by starting out with corrections based on the
set P0 of physical laws, it would not obtain by beginning instead with
corrections presupposing some particular non-Euclidean set P or vice
versa: just as in the case of Newton's method of successive approximation
[5, p. 286], there are conditions, as A. Suna has pointed out to me, under
which there would be no convergence. We might then nonetheless
succeed as follows in finding the geometry Gt empirically, if our space is
one of constant curvature.
The geometry Gr resulting from measurements by means of a corrected
rod is a single-valued function of the geometry Ga assumed in the
correctional physical laws, and a Laplacian demon having sufficient
knowledge of the facts of the world would know this function Gr = f(Ga).
Accordingly, we can formulate the problem of determining the geometry
empirically as the problem of finding the point of intersection between the
curve representing this function and the straight line Gr = Ga. That there
exists one and only one such point of intersection follows from the
existence of the geometry Gt defined above, provided that our space is
one of constant curvature. Thus, what is now needed is to make determi-
nations of the Gr corresponding to a number of geometrically-different
sets of correctional physical laws Pa, to draw the most reasonable curve
Gr = f(Ga) through this finite number of points (Ga, Gr), and then to find
the point of intersection of this curve and the straight line Gr = Ga.
Whether this point of intersection turns out to be the one representing
Euclidean geometry or not is beyond the reach of our conventions,
barring a remetrization. And thus the least that we can conclude is that
since empirical findings can greatly narrow down the range of uncertainty
as to the prevailing geometry, there is no assurance of the latitude for the
choice of a geometry which Einstein takes for granted. Einstein's Duhe-
mian position would appear to be inescapable only if our proposed method
of determining the geometry by itself empirically cannot be generalized in
some way to cover the general relativity case of a space of variable
curvature and if the latter kind of theory turns out to be true.
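The graphical method of the preceding paragraph can likewise be sketched in code. Again this is an illustration under assumptions which are not part of the argument: each geometry of constant curvature is represented by its curvature, the "most reasonable curve" through the points (Ga, Gr) is taken to be a least-squares straight line, and the sample values used below are invented. A different family of curves could equally be fitted; the straight line is chosen only for brevity.

```python
# Sketch: fit Gr = f(Ga) to a finite number of determinations and find
# where the fitted curve meets the line Gr = Ga. Geometries are represented
# by curvature values, and f is assumed to be a straight line.

def fit_line(points):
    """Least-squares line Gr = slope * Ga + intercept through (Ga, Gr) pairs."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_r = sum(r for _, r in points) / n
    sxx = sum((a - mean_a) ** 2 for a, _ in points)
    sxy = sum((a - mean_a) * (r - mean_r) for a, r in points)
    slope = sxy / sxx
    return slope, mean_r - slope * mean_a

def intersection_with_diagonal(slope, intercept):
    """Solve slope * G + intercept = G for the self-consistent geometry G."""
    if abs(1.0 - slope) < 1e-12:
        raise ValueError("fitted line is parallel to Gr = Ga")
    return intercept / (1.0 - slope)

# Invented determinations (Ga, Gr) for four assumed geometries.
samples = [(-0.05, -0.029), (0.0, -0.014), (0.05, 0.001), (0.10, 0.016)]
slope, intercept = fit_line(samples)
g_t = intersection_with_diagonal(slope, intercept)
```

The point of the sketch is that the value `g_t` is fixed by the determinations themselves and not by the geometry assumed at the outset; no iteration, and hence no convergence, is required.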
It would seem therefore that, contrary to Einstein, the logic of elimi-
nating distorting influences prior to stipulating the rigidity of a solid body
is not such as to provide scope for the ingression of conventions over and
above those acknowledged in Riemann's analysis of congruence, and trivial
ones such as the system of units used. Our analysis of the logical status of
the concept of a rigid body thus leads to the conclusion that once the
physical meaning of congruence has been stipulated by reference to a
solid body for whose distortions allowance has been made compu-
tationally as outlined, then the geometry is determined uniquely by the
totality of relevant empirical facts. It is true, of course, that even apart
from experimental errors, not to speak of quantum limitations on the
accuracy with which the metric tensor of space-time can be meaningfully
ascertained by measurement [29; 37], no finite number of data can uniquely
determine the functions constituting the representations $g_{ik}$ of the
metric tensor in any given coordinate system. But the criterion of inductive
simplicity which governs the free creativity of the geometer's imagination
in his choice of a particular metric tensor here is the same as the one
employed in theory formation in any of the non-geometrical portions of
empirical science. And choices made on the basis of such inductive
simplicity are in principle true or false, unlike those springing from
considerations of descriptive simplicity, which merely reflect conventions.
The author is indebted to Dr. Samuel Gulden of the Department of
Mathematics, Lehigh University, U.S.A. for very helpful discussions.
Bibliography
[1] BONOLA, R., Non-Euclidean Geometry. New York, 1955. IX + 268 pp.
[2] BROWN, F. A., Biological clocks and the fiddler crab. Scientific American, vol.
190 (April, 1954), pp. 34-37.
[3] CARNAP, R., Der Raum. Berlin, 1922 (Supplement No. 56 of Kant-Studien).
87 pp.
[4] CLEMENCE, G. M., Time and its measurement. American Scientist, vol. 40
(1952), pp. 260-269; and Astronomical time. Reviews of Modern Physics, vol.
29 (1957), p. 5.
[5] COURANT, R., Vorlesungen über Differential- und Integralrechnung, vol. 1.
Berlin, 1927. XIV + 410 pp.
[6] DUHEM, P., The Aim and Structure of Physical Theory. Princeton, 1954.
XXII + 344 pp.
[7] EDDINGTON, A. S., Space, Time and Gravitation. Cambridge, 1953. VII + 218
pp.
[8] EINSTEIN, A., The foundations of the general theory of relativity. In: The
Principle of Relativity, a collection of original memoirs, London, 1923, pp.
111-164.
[9] EINSTEIN, A., Reply to criticisms. In: Albert Einstein: Philosopher-Scientist
(edited by SCHILPP, P. A.). Evanston, 1949, pp. 665-688.
[10] EISENHART, L. P., Riemannian Geometry. Princeton, 1949. VII + 306 pp.
[11] GRÜNBAUM, A., A consistent conception of the extended linear continuum as an
aggregate of unextended elements. Philosophy of Science, vol. 19 (1952), pp.
288-306.
[12] GRÜNBAUM, A., Carnap's views on the foundations of geometry. In: The
Philosophy of Rudolf Carnap (edited by SCHILPP, P. A.), (forthcoming).
[13] GRÜNBAUM, A., Geometry, Chronometry and Empiricism. In: Minnesota Studies
in the Philosophy of Science (edited by FEIGL, H. and MAXWELL, G.), vol. III
(forthcoming).
[14] HERTZ, H., The Principles of Mechanics. New York, 1956, 271 pp.
[15] HOBSON, E. W., The Theory of Functions of a Real Variable, vol. 1. New York,
1957, XV + 736 pp.
[16] MENGER, K., On algebra of geometry and recent progress in non-euclidean
geometry. The Rice Institute Pamphlet, vol. 27 (1940), pp. 41-79.
[17] MILNE, E. A., Kinematic Relativity. Oxford, 1948, VI + 238 pp.
[18] NEWTON, I., Principia (edited by CAJORI, F.). Berkeley, 1947, XXXV +
680 pp.
[19] PAP, A., Are physical magnitudes operationally definable? In: Measurement:
Definitions and Theories (edited by CHURCHMAN, C. W. and RATOOSH, P.).
New York, 1959 (in press).
[20] POINCARÉ, H., The Foundations of Science. Lancaster, 1946, XI + 553 pp.
[21] REICHENBACH, H., The philosophical significance of the theory of relativity. In:
Albert Einstein: Philosopher-Scientist (edited by SCHILPP, P. A.). Evanston,
1949, pp. 287-311.
[22] REICHENBACH, H., The Rise of Scientific Philosophy. Berkeley, 1951, XI + 333 pp.
[23] REICHENBACH, H., The Philosophy of Space and Time. New York, 1958, XVI + 295 pp.
[24] RIEMANN, B., Gesammelte Mathematische Werke (edited by WEBER and
DEDEKIND). New York, 1953, X + 558 pp.
[25] ROBERTSON, H. P., Geometry as a branch of physics. In: Albert Einstein:
Philosopher-Scientist (edited by SCHILPP, P. A.). Evanston, 1949, pp. 313-332.
[26] RUSSELL, B., Sur les axiomes de la géométrie. Revue de Métaphysique et de
Morale, vol. 7 (1899), pp. 684-707.
[27] RUSSELL, B., Our Knowledge of the External World. London, 1926, 251 pp.
[28] RUSSELL, B., The Foundations of Geometry. New York, 1956. 201 pp.
[29] SALECKER, H. and WIGNER, E. P., Quantum Limitations of the Measurement of
Space-Time Distances. The Physical Review, vol. 109 (1958), pp. 571-577.
[30] SOKOLNIKOFF, I. S., Mathematical Theory of Elasticity. New York, 1946,
XI + 373 pp.
[31] STRUIK, D. J., Classical Differential Geometry. Cambridge, 1950, VIII + 221
pp.
[32] TIMOSHENKO, S. and GOODIER, J. N., Theory of Elasticity. New York, 1951,
XVIII + 506 pp.
[33] VON HELMHOLTZ, H., Schriften zur Erkenntnistheorie (edited by HERTZ, P.
and SCHLICK, M.). Berlin, 1921, IX + 175 pp.
[34] WHITEHEAD, A. N., The Concept of Nature. Cambridge, 1926, VIII + 202 pp.
[35] WHITEHEAD, A. N., The Principle of Relativity. Cambridge, 1922, XII + 190 pp.
[36] WHITEHEAD, A. N., Process and Reality. New York, 1929, XII + 546 pp.
[37] WIGNER, E. P., Relativistic invariance and quantum phenomena. Reviews of
Modern Physics, vol. 29 (1957), pp. 255-268.
PART II
FOUNDATIONS OF PHYSICS
Symposium on the Axiomatic Method
HOW MUCH RIGOR IS POSSIBLE IN PHYSICS?
P. W. BRIDGMAN
Harvard University, Cambridge, Massachusetts, U.S.A.
Let me begin by saying that I have accepted the invitation to speak to
this Symposium on the Axiomatic Method with extreme hesitation.
I think I realize that there is a highly developed axiomatic technique and
that to many of you the questions of greatest interest in this field are
questions of technique. To an outsider like myself the spectacle of the
virtuosity exhibited by some of you in the practise of this technique is a
little terrifying. I realize that many, if not all of you, will be impatient
with the generalities which I have to offer and will be eager to get on with
the more vital business of detailed attack on the numerous technical
problems. I cannot even hope that my generalities will not seem to you
too obvious to be worth saying, and that I may appear in the light of an
enfant terrible, blurting out the things that everyone knows but has too
much sense to say out loud. If, in spite of all this, I am venturing to talk
to you, it is partly selfish because it appeared that I could not otherwise
attend this meeting, and I expect, in spite of your technicalities, to pick
up points of view which will be new and profitable. But beyond this, I do
think that it is worth while, occasionally, to say the obvious things out
loud, for I do not believe that we have, even yet, taken into account all
the obvious things. In any event, I am glad that the program committee
put my paper in the opening session, so that you can soon get it out of the
way and turn to more interesting and pressing matters.
The "rigor" which I shall talk about is not itself a very precise or
rigorous thing. In its first usage "rigor" is applied to reasoning. If,
however, rigorous reasoning is to be possible, the objects and operations
of our reasoning must have certain properties, so that "rigor" comes to
have an extended meaning. In this extended meaning it implies sharpness
and precision and it has overtones of certainty. It is in this extended
sense that I shall be concerned with rigor. My task will be to examine to
what extent what we do in physics can have the attributes of sharpness,
precision, and certainty. I shall assume as not needing argument that in
no field of activity are these attributes actually attainable, but they
function only as limiting ideals, which are never fully attained even in as
abstract a domain as that of postulate theory.
All human enterprise, of which postulate theory and physics are
special cases, is subject to one restriction on any attainable sharpness or
certainty which is so ubiquitous and unavoidable that we seldom bother
even to mention it. The possibility of self-doubt is always with us; we
can always ask ourselves whether we are really doing what we think we
are doing or how we can be sure that we have not suddenly gone insane
or are not dreaming. All our intellectual activity not only is, but has to
be, based on the premise that intellectually we are going concerns. In
so far as this is common to postulate theory and the physics of the
laboratory I need not stop to elaborate the point further. It seems to me,
however, that there are points here which in another context might be
analyzed further than they usually are. Just what is involved in the
assumption that I am a going concern intellectually? And how shall I go
to work to assure myself that the assumption actually applies to me?
In particular, what is the method by which I can assure myself that I am
not now dreaming? I have seen no such method.
Forgetting now any lack of sharpness arising from self-doubt, there are
certain human activities which apparently have perfect sharpness. The
realm of mathematics and of logic is such a realm, par excellence. Here we
have yes-no sharpness: two numbers are either equal to each other
or they are not; a certain point either lies on a given line or it does not;
there is only one straight line connecting any two points. Now it is a
matter of observation that this yes-no sharpness is found only in the
realm of things we say as distinguished from the realm of things we do.
Sharpness is an attribute of the way we talk about our experience, in
particular whether we talk about it in yes-no terms, rather than an
attribute of the experience itself, if you will be charitable enough to
grant me meaning in such a way of expression. Nothing that happens in
the laboratory corresponds to the statement that a given point is either
on a given line or it is not.
There is no question but that we do talk about aspects of experience in
yes-no terms, and in so far as any field of experience has such yes-no
sharpness it has it in virtue of the fact that it is a verbal activity. One may
well question, however, whether we have any right to ascribe such yes-no
properties to any verbal activity. What are these words anyhow? They
are not static things, but are themselves a form of activity which varies in
some way with every so-called repetition of the word. A word as we use
it is part of a terribly complicated system, involving both present struc-
ture in the brain and the past experience of the brain, most of which we
cannot possibly be conscious of. The assumption that we are going
concerns intellectually involves much more than merely the absence of
self-doubt.
The physics of measurement and of the laboratory does not have the
yes-no sharpness of mathematics, but nevertheless employs conventional
mathematics as an indispensable tool. Every physicist combines in his
own person, to greater or less degree, the experimental physicist who
makes measurements in the laboratory, and the theoretical physicist who
represents the results of the measurements by the numbers of mathe-
matics. These numbers are things that he says or writes on paper. The
jump by which he passes from the operations of the laboratory to what
he mathematically says about the operations is a jump which may not be
bridged logically, and is furthermore a jump which ignores certain es-
sential features of the physical situation. For the mathematics which
the physicist uses does not exactly correspond to what happens to him.
In the laboratory every measurement is fuzzy because of error. As far as
reproducing what happens to him is concerned, the mathematics of the
physicist might equally well be the mathematics of the rational numbers,
in which such irrationals as √2 or π do not occur. Now one would
certainly be going out of one's way to attempt to force theoretical physics
into a straitjacket of the mathematics of the rational numbers as
distinguished from the mathematics of all real numbers, but by forcing it
into the straitjacket of any kind of mathematics at all, with its yes-no
sharpness, one is discarding an essential aspect of all physical experience
and to that extent renouncing the possibility of exactly reproducing that
experience. In this sense, the commitment of physics to the use of mathe-
matics itself constitutes, paradoxically, a renunciation of the possibility
of rigor.
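The remark about the rational numbers can be made concrete by a small sketch. It is offered only as an illustration under assumptions of my own choosing: the function name `rational_within` is hypothetical, and the error of one part in a thousand is an arbitrary stand-in for the fuzziness of a laboratory measurement. The sketch finds a rational number which no measurement of that accuracy could distinguish from √2, although no rational number is exactly equal to it.

```python
# Sketch: within any finite measuring error, a rational number reproduces
# a measured value as well as an irrational one does.
from fractions import Fraction
import math

def rational_within(value, error, max_denominator=10**6):
    """Return the first fraction n/d (d = 1, 2, ...) lying within `error`
    of `value`, or None if no denominator up to max_denominator suffices."""
    for d in range(1, max_denominator + 1):
        n = round(value * d)          # nearest numerator for this denominator
        if abs(n / d - value) <= error:
            return Fraction(n, d)
    return None

# Arbitrary illustrative accuracy: one part in a thousand.
approx = rational_within(math.sqrt(2), 1e-3)
indistinguishable = abs(float(approx) - math.sqrt(2)) <= 1e-3
exactly_equal = (approx * approx == 2)   # never true for a rational number
```

Tightening the error merely yields a fraction with a larger denominator; at no finite accuracy does the measurement itself call for the irrational number.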
The unavoidable presence of error in any physical measurement which
we are here insisting on reminds one of the fuzziness in the measurement
of conjugate quantities covered by the Heisenberg principle of inde-
termination, but is, I believe, something quite different. The sort of error
that we here are concerned with would still be present in our knowledge of
the so-called "pure case" of quantum mechanics. In so far as quantum
theory treats, for example, the charge on the electron or Planck's constant
as mathematically sharp numbers, as it does, it is in so far neglecting an
essential aspect of all our experience. It used to be thought that the errors
of physical measurement were a more or less irrelevant epiphenomenon,
which could be avoided in the limit by the construction of better and bet-
ter measuring apparatus. This happy conviction appeared less compelling
when the atomic structure of all matter was established, including the
atomic structure of the measuring apparatus. Now, it appears to me, the
linkage of error with every sort of physical measurement must be re-
garded as inevitable when it is considered that the knowledge of the
measurement, which is all we can be concerned with, is a result of the
coupling of the external situation with a human brain. Even if we had
adequate knowledge of the details of this coupling we admittedly could
not yet use this knowledge in formulating in detail how the unavoidable
fuzziness should be incorporated in our description of the world nor how
we should modify our present use of mathematics. About the only thing
we can do at present is to continue in our present use of mathematics,
but with the addition of a caveat to every equation, warning that things
are not quite as they seem.
Quantum theory has effectively called to attention certain other
important features of the world about us. The realm in which quantum
effects are usually considered to be important is in the first instance the
realm of small things — small distances and short times. Phenomena in
this realm do not present themselves directly to our unaided senses, but
occur only in conjunction with special types of instrument, with which we
say that we "extend" the scope of our senses. But if we examine what we
actually do, we see that these instruments function through our con-
ventional senses. Hence, it does not reproduce what actually happens to
say, for example, that the microscope reveals to us a new "microscopic
world". The so-called microscopic world is really a new macroscopic
world which we have found how to enter by inventing new kinds of
macroscopic instrument. The "world" of quantum phenomena eventually
has to find its description and explanation in terms of the things that
happen to us on the macroscopic scale of every day life. I think most
quantum theorists will admit this if they are pressed, but in spite of this
the language of ordinary quantum theory is a language of microscopic
entities which we handle verbally just as if they had the existential status
of the objects of daily life. There is ample justification for this in the
enormous simplification which results in our description and our handling
of experience. This simplification is nevertheless bought at a price — the
price of neglecting and forgetting some of the unavoidable accompani-
ments of all our experience. By thus agreeing to blur some of the recog-
nizable aspects of experience we have at the same time condemned
ourselves to a loss of possible rigor, using "rigor" with the implications
already explained. This sort of thing is by no means characteristic ex-
clusively of quantum theory — strictly we should never think of bacteria
without thinking of microscopes or think of galaxies without thinking of
telescopes, but such rigor of thought is hardly attainable in practise.
Another matter which quantum theory has forcibly called to our
attention is that the instrument of observation may not properly be
separated from the object of observation. Heisenberg's principle of
indetermination is one of the consequences of following out the implications
of this. The principle that instrument of observation is not to be separated
from object of observation is, it seems to me, a special case of a broader
principle, namely that experience has to be taken as a whole and may
not be analyzed into pieces. In other words, the operation of isolation is
not a legitimate operation. Now the operation of isolation is perhaps the
most universal of all intellectual operations, and without it rational
thought would hardly be possible. Nevertheless, in the world of quantum
phenomena situations arise in which our propensity for isolating defi-
nitely gets us into trouble. For instance, the electron is not properly to be
thought of in isolation, but only as an aspect of the total experimental
set-up in which it appears. When we view the electron in this light the
paradox disappears from such situations as the interference pattern
formed by the electron in the presence of two slits, where the electron,
if we treat it as an ordinary isolatable object that can go through only one
of the slits, apparently "knows" of the existence of the other slit without
going through it. We are thus driven to concede that the operation of
isolation cannot be legitimate "in principle", but this concession presents
us with an extraordinarily difficult dilemma, for the very words in which we
express the illegitimacy of the operation of isolation receive their meaning
only in a context of isolation. In practise we meet the situation as best
we can by methods largely intuitive in character which we have acquired
by long practise. But I think that even our best practise has disclosed no
method of sharply handling the situation — the method of isolation is
neither sharply separated from the method of holism, nor is there any
sharp criterion which determines when we shall shift from the one method
to the other. Neither is it possible to express sharply in language what we
mean by the one as distinguished from the other. The best we can do
in practise is a sort of spiralling approximation, shifting back and forth
from one level of operation to the other, and concentrating our attention
first on one aspect and then on another of the total situation. In such a
setting we cannot expect rigor.
There are many other situations in which the operation of isolation
leads to dilemma and paradox. Long ago, on the classical level, the
concepts of thermodynamics found their meaning in terms of operations
performed on isolated systems. Not only do the fundamental concepts of
energy and entropy receive their meaning in terms of physical systems
isolated in space, but isolation in time is also required, because otherwise
reversibility, or, more generally, recoverability of previous condition,
does not occur. Without recoverability the concepts of thermodynamics
are incapable of definition. This necessity for isolation in the fundamental
definitions leads to logical difficulties when we attempt to extend the
notions of energy or entropy to the universe as a whole. The logical status
of any theorem involving the conservation of the energy of the universe,
or the universal degradation of energy and eventual heat death of the
universe, seems to me exceedingly obscure. Furthermore, the classical
connection between deterministic and statistical mechanics which ex-
presses entropy in probabilistic terms seems to me to involve an ille-
gitimate treatment of the entropy of isolated bodies. It is often said that
an isolated system comprising many molecules approaches, with the
passage of time, a completely disordered state and hence the condition of
maximum entropy, because of the "law of large numbers", in virtue of
which the internal condition eventually becomes one of molecular chaos
in spite of the fact that the laws of the individual molecular encounters
are completely deterministic. This it seems to me is logically fallacious.
Given an isolated system, with a definite initial distribution and deter-
ministic individual encounters, logically it can never evolve into a system
with chaotic distribution. To say that chaos gets in through the operation
of the "law of large numbers" seems to me to introduce a completely
unjustified and ad hoc concept. But chaos may logically get into the
system through the walls which are coupled to the external world. This
coupling is part of a divergent process — the state of the walls may not
be deterministically specified except by coupling them to an ever in-
creasing domain of the external world over which we have ever less
control. The only acceptable method which has been found for dealing
with this divergent process is through probability. Here again we have
paradox — the concept, entropy, is applicable only in a context of isolated
systems, but the detailed mechanism, through the operation of which
entropy functions, occurs only in non-isolated systems.
In general it seems to me that the situations contemplated in proba-
bility analysis are particularly situations in which the jump from theory
to application cannot be made sharply, so that no application of proba-
bility theory can be rigorous. It is particularly important to realize this
now that quantum theory is disposed to regard probabilities as something
fundamental and unanalyzable rather than as an artefact in an es-
sentially deterministic universe. Against this tendency of the theoretical
physicists must be placed, I believe, the realization that the fundamental
concepts of probability have meaning only in the context of a determi-
nistic background. No situation is ever completely chaotic, but it is only
restricted aspects which are probabilistic. We cannot say that a particular
fall of a die is undetermined and probabilistic unless the die itself, the
table top on which it rolls, and we ourselves who observe it and talk
about it, retain their conventional deterministic identity. We have here
a special case of the theorem that eventually any new concepts must find
their meaning on the level of daily life. And since the level of daily life is
preponderantly deterministic, I believe it is impossible to handle
probability consistently as ultimate and unanalyzable.
"Randomness" is a concept fundamental in probability analysis and
of such importance that lists of random numbers are often printed and
employed in practical applications. Yet theoretically no finite set of
numbers can be completely random, because there are an infinite number
of conditions of randomness. In practise, no set of numbers that has been
printed or otherwise actually exhibited can possibly be random, nor can a
series of events that has actually occurred be random, because it is always
possible to find some sort of regularity in any finite sequence. The con-
cept of randomness, so fundamental to the whole conceptual edifice, thus
appears as a loose concept, incapable of realization in practise. "Random-
ness" occurs only in the realm of things we say.
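The claim that some regularity can always be found in a finite sequence can be illustrated by a minimal sketch, which is not part of the original argument. It assumes a hypothetical list of eight digits and uses exact Lagrange interpolation to construct a polynomial rule that reproduces every entry; the list `digits` and the helper `interpolating_rule` are illustrative names only.

```python
from fractions import Fraction


def interpolating_rule(values):
    """Return f such that f(k) == values[k] for k = 0 .. len(values) - 1.

    Uses the Lagrange form with exact rational arithmetic, so the rule
    reproduces the listed values without rounding error.
    """
    n = len(values)

    def f(x):
        total = Fraction(0)
        for i, y in enumerate(values):
            term = Fraction(y)
            for j in range(n):
                if j != i:
                    term *= Fraction(x - j, i - j)
            total += term
        return total

    return f


# Hypothetical "random" digits, chosen arbitrarily for illustration.
digits = [7, 3, 9, 0, 4, 4, 1, 8]
rule = interpolating_rule(digits)

# The polynomial rule regenerates the whole list, entry by entry.
reproduced = [rule(k) for k in range(len(digits))]
print(reproduced == digits)
```

Any finite list admits such a rule, which is one concrete sense in which a printed table cannot satisfy every condition of randomness at once.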
Probability theory runs into other sorts of difficulty when it deals with
rare events. A literal application of kinetic theory and statistical mecha-
nics yields a calculably small finite probability for any compound event.
Thus there is a finite probability that if we watch long enough we shall
some day see a pail of water freeze on the fire, a conclusion that Bertrand
Russell has delighted to rub in. Or consider another example in somewhat
the same vein. Suppose that I have measured some object by ordinary
laboratory procedures and find it to be 1.500 meters, with some apparent
uncertainty in the last millimeter. Suppose that I choose to report this
measurement by saying that the length of the object is between 1 and 2
meters. Then probability theory states that there is some probability that
this statement is incorrect. Now it seems to me that a theory which makes
these two statements, about the freezing water and the error of my
measurement, is a theory which fails to agree qualitatively with the
nature of everyday experience. The finite probability of freezing or of
error is a property of our mathematics, not of the situation which the
mathematics is designed to describe, and in thus dealing with rare events
our probability analysis reveals itself as only an approximation. In
general, it seems to me that one has a right to question any probability
analysis which predicts an event so rare that it has not yet been observed.
One might even venture a theorem to this effect. Such a putative theorem
receives a certain justification when it is considered that the prediction of
rare events involves long range extrapolations, which would demand the
establishment of the fundamental laws of mechanics with an accuracy far
beyond that actually attainable.
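The order of magnitude involved in the measurement example can be sketched numerically. The text gives only "some apparent uncertainty in the last millimeter"; the Gaussian error model and the standard error of 1 mm below are assumptions made purely for illustration, as are all the names in the code.

```python
import math

# Assumptions (not in the text): Gaussian errors with a 1 mm standard error.
SIGMA_M = 0.001
MEASURED_M = 1.500
LOWER_M, UPPER_M = 1.0, 2.0

# Distance from the measured value to either bound, in standard errors.
z = (UPPER_M - MEASURED_M) / SIGMA_M

# Direct evaluation of the two-sided tail P(|Z| > z) = erfc(z / sqrt(2))
# underflows to zero in double precision at this distance.
p_direct = math.erfc(z / math.sqrt(2))


def log10_two_sided_tail(z):
    """Leading-order asymptotic for large z:
    P(|Z| > z) ~ 2 * exp(-z**2 / 2) / (z * sqrt(2 * pi)).
    Returns the base-10 logarithm of that estimate.
    """
    return (
        math.log10(2)
        - (z * z / 2) / math.log(10)
        - math.log10(z * math.sqrt(2 * math.pi))
    )


log10_p = log10_two_sided_tail(z)
print(p_direct, log10_p)
```

Under these assumptions the probability that the statement "between 1 and 2 meters" is wrong has a base-10 logarithm of roughly minus fifty-four thousand, a figure that illustrates how far such a prediction lies beyond anything observation could check.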
Our intellectual difficulties are thus not peculiar to the new situations
revealed by quantum theory, but classical physics has always had its
share of difficulty and paradox. Among these difficulties may be mentioned
those of dealing with continuous media. The equations of hydro-
dynamics, for instance, purportedly deal with continuous media, but the
variables in the equations refer to the motion of "particles" of the fluid,
which, whatever other properties they may have, at least have the
property of identifiability. Whatever it is that bestows the identifiability
would seem to violate the presumptive perfect homogeneity and conti-
nuity of the fluid. The two concepts are mutually contradictory and
exclusive, but nevertheless our thinking seems to demand them, and as
far as I know no one has invented a way of getting along without them.
I believe that there are somewhat similar difficulties with the concept
of "field" which by many is regarded as fundamental to modern theo-
retical physics. We think of the field at any point of space as something
"real", independently of whether there is an instrument at the point to
measure it. But when we try to account mathematically for the fact that
our instrument apparently responds to what was there before we went
there with the instrument, we find that actually the instrument responds
to the modified state of affairs after the instrument is introduced. (This is
shown by an analysis of the Maxwell stresses.) Our attempt to give
instrumental meaning to something that exists in the absence of the
instrument seems foredoomed to failure; one can detect the odor of
a logical inconsistency here. Yet our thinking seems to demand that we
attach a meaning to what would be there in the absence of the instrument,
whereas meaning itself exists only in a context of instruments.
All the infelicities and ineptnesses which we have encountered up to
now arise because we have been trying to do something with our minds
which cannot be done. After long experience we have found how to deal
with situations of this sort after a fashion. We push the conventional line
of attack as far as we can, and when we presently run into conceptual
difficulties, we usually meet these difficulties, not by any drastic revision
of our conceptual structure, but by keeping as much of it as we can and
patching it up by rules explicitly warning of the limitations of the con-
ventional machinery. There is a certain resemblance between this general
situation and the special situations in quantum theory to which the Hei-
senberg principle is applicable. We cannot, for example, push our con-
ventional description of a physical system in terms of space and time too
far toward the microscopic without running into difficulties with our
description of the same system in terms of cause and effect, although on
the scale of daily life a description in terms of space and time is practically
synonymous with a description in terms of cause and effect. There are
many examples in quantum theory where we have to decide which of two
mutually exclusive forms of description we shall employ.
Bohr sees all these as examples of the principle of complementarity, but
he regards this principle as something of much broader scope and of
deeper philosophical significance than as merely a principle limited in its
application to physical systems. Thus he speaks of the impossibility of
reconciling the demands of justice and mercy, and the presumptive
impossibility of making a physical analysis of biological systems suf-
ficiently searching to disclose the nature of life without destroying that
life, as examples of the general principle of complementarity. It seems to
me that it did not need quantum theory to disclose this general situation,
but that we have always had situations where we have been forced to
shift to another line of attack when we push our analysis to the logical
limit. In other words, the method of "yes-but" we have always had with
us. It seems to me that the generalized principle of complementarity is
merely a glorified version of the principle of "yes-but". The method of
"Yes-but" goes back at least to the time of Zeno, who, I will wager, was
as capable as the next man of catching the tortoise which he intended to
convert into stew for dinner, in spite of his paradoxes of motion. This
sort of thing it seems to me is too ubiquitous and too vague to warrant
our seeing here the operation of some grandiose "principle", nor do I
believe that it materially increases the presumptive truth of quantum
theory to have discovered this sort of qualitative situation concealed
in the consequences of its analysis. In fact, if it had not found this sort
of thing it would be presumptive evidence against it. These strictures
must not be taken as in any way reflecting on the validity of the numerical
relationships demanded by quantum theory — these are an entirely
different sort of thing.
Whatever view we take of complementarity as a grandiose principle of
sweeping applicability, it seems obvious to me that here we have a factor
militating against sharpness, for the line separating, for example, a
legitimate space-time description from a deterministic description cannot
be sharply drawn. Whenever we encounter such a lack of sharpness we
may anticipate also a failure of the possibility of rigor.
All the situations which we have encountered thus far have a feature
in common. In all of them we have encountered failures of our intellectual
machinery to deal with experience as we obviously would like to have it
deal — in particular, our intellectual machinery has proved itself in-
capable of exactly reproducing what we see happen. For instance, our
verbalizing, or our mathematics, which is the same thing, has no built-in
cut-off, corresponding to error or to the finiteness of human experience.
In addition to this sort of failure of our mental machinery to exactly
reproduce features of experience which are fairly obvious and which are
often explicitly talked about, I think there is also failure for reasons not
usually appreciated or said out loud, reasons corresponding to demands
we ought to make of our mental machinery but which in fact we do not.
I think it will be admitted that an ideal mental machinery will not
employ the operation of isolation for the reason that isolation does not
occur in actuality. Quantum theory prohibits the isolation of the object of
knowledge from the instrument of knowledge, and successfully analyzes
the situations to which the Heisenberg principle applies in terms of the
reactions between instrument and object which are ignored when they are
isolated from each other. But any actual situation involves not only
instrument of knowledge and object of knowledge, but also the knower.
Quantum theory, however, consistently neglects the knower. Thus I find
the following quotation in a recent lecture by Professor Bohr: "In every
field of experience we must retain a sharp distinction between the observer
and the contents of the observations." But in the world of things that
happen this sort of distinction does not occur, and in making the distinc-
tion it seems to me that quantum theory practises a kind of isolation. In
physics the knower is always there, whether I am concerned with myself
practising physics or whether I observe other people practising it. It may
well be that quantum theory is justified for its particular purposes in
taking the knower for granted, but we, in so far as we are committed to
the problem of describing and understanding the total scene, may not
neglect the knower. The problem of getting the knower into the picture
has become acute now that most of us have become convinced that the
knower is itself a physical system. Formerly, when people could think of
mental activity as the functioning of a special mind stuff, sui generis, and
with little in common with ordinary matter, it did not appear logically
absurd to hope to give an account of the one kind of matter independently
of the properties of the other. But now we are convinced that mental
activity occurs in physical structures of stupendous complexity, made of
the same atoms that the activity is seeking to comprehend. These com-
plexities, if anything, increase the urgency of understanding the nature of
the coupling between the structure of the brain and the external world.
The presumption that there is some sort of essential limitation because of
the nature of the structure and the coupling appears irresistible.
The concepts in terms of which we describe and understand the world
about us do not occur in nature, but are man made products. Such things
as length, or mass, or momentum, or energy occur only in conjunction with
brains. The significance of these concepts cannot be isolated and as-
sociated only with the external world, but the significance is a joint
significance involving external world and brain together. Now it seems
to me that it is quite conceivable that different properties of the brain
structure are involved in the concept of length, for example, than are
involved in the concept of mass. It might be that the concept of mass is
beyond the powers of certain simple types of brain whereas the concept of
length might be easily within them. If such were the case, or if our present
brain structure carries vestiges of limitations of this sort, our outlook
might be materially altered.
To completely answer the questions brought up by considerations of
this sort not only should we be able to hold ourselves to an awareness of
the indissoluble tie-in of brain structure with the external world, but we
should be able to describe specifically the nature of this tie-in for different
concepts such as mass or length. We are at present hopelessly far from
being able to do this, or even from knowing whether it is possible "in
principle". There is, however, something which we can now do which has
the effect of shifting the center of gravity away from the unknown
contribution of the brain, so that a little more "objectivity" can be imparted
to our physical concepts. If one examines what he does when he
determines such physical parameters of a physical system as its mass or
its energy, it will be seen that the procedure involves the complicated
interplay of operations of manipulation in the laboratory and operations
of calculation. It is into these latter operations of calculation that the
unknown and questionable influence of brain structure enters. Suppose
now that we define the energy of a body, not as the number which is ob-
tained by combining in a certain way other numbers which may corre-
spond to velocity and mass, but as the number which is automatically
given when a certain type of instrument, an "energy measurer" is coupled
to the body. Such an "energy" is more something that we do and less
something that we say and think than the conventionally defined energy.
If we are clever we ought to be able to design instruments which would
automatically record on a scale, when coupled to the body, any of the
conventional physical parameters. When we have designed such instru-
ments we should be able incidentally to discover some of the limitations
in the measurability of energy, for instance, whereas it would be hopeless
to expect to find such limitations as long as we have to treat the limi-
tations as incidental to the structure of the brain.
I have made the beginning of an attempt to specify in detail how
instruments might be constructed which would automatically register on
a dial this or that physical parameter of an object when coupled to it.
It is evident that the instruments will fall into hierarchies, the higher
members of the hierarchy employing as component parts the complete
instruments of lower levels. An instrument for automatically recording
length is fairly easy to construct, whereas I found it to require great
complication to construct an instrument for indicating mass, and even so
there appear to be definite limitations on such features as speed of re-
sponse. This is in spite of the fact that it is just as easy to say mass as to
say length, and that in such an activity as dimensional analysis we think
of mass and length as of equal simplicity. When we define mass and length
instrumentally in this way, we see that, because of its greater complexity,
it will not be so easy to apply the mass measuring instrument to small
objects as the length measuring instrument, so that the concept of mass is
subject to limitations in the direction of the very small to which the con-
cept of length is not subject. This sort of limitation is entirely different
from the sort of mutual limitation of measurements of velocity and po-
sition, for example, controlled by the Heisenberg principle in quantum
mechanics. It suggests itself that there may be other sorts of conceptual
limitations in making contact with the world than those treated by con-
ventional quantum mechanics.
The general problem back of all these later considerations is for the
knower to know himself. It has been recognized as a fundamental philo-
sophical problem since at least the time of Socrates, but it appears that
we have not got very far toward a solution. Recent developments make
it appear that the solution of this problem is more difficult than was
perhaps at one time optimistically assumed. For we have here a self-
reflexive situation, a system dealing with itself. Gödel's theorem shows
that in the case of at least one special type of such a system there are
drastic and formerly unsuspected limitations. It does not appear un-
reasonable to suppose by analogy that there are also formidable diffi-
culties in the general case. I believe these difficulties appear the moment
one attempts a specific attack on the problem — in fact it is difficult to
even formulate what the problem is in self-consistent language. It seems
to me that nevertheless the problem is one of the very first importance. I
think what I have said here makes it at least doubtful whether any
possible solution can be rigorous in the canonical meaning of rigor — I
believe that this will increase the difficulty of finding an acceptable so-
lution rather than decrease it, as might perhaps at first seem natural.
Until we have solved the problem, I do not believe that we can estimate
what the limitations are on any possible rigor, nor even, for that matter,
know what the true nature of rigor is.
Symposium on the Axiomatic Method
FINITENESS IN CLASSICAL MECHANICS, ITS
AXIOMS AND THEIR IMPLICATIONS
ALEXANDRE FRODA
Académie de la République Populaire Roumaine, Institut de Mathématiques, Bucharest,
Romania
The physical world never reveals to us, at least at our scale of
magnitudes, the actual existence of the infinite. In particular, one
encounters in classical mechanics neither infinite forces nor an infinity of
reversals of the direction of motion of a material body in a finite lapse of
time. This is what suggested to us the introduction of two new axioms into
mechanics and the study of their implications. We have called them
axioms of finiteness (F).
Concerning the negation of the infinite (in classical mechanics) which
inspired these axioms, one may cite one of the profound remarks of E.
Mach on the evolution of mechanics [8, 8; 34]: "One of the characteristics of
instinctive knowledge," he wrote, "is that it is above all negative. What we
are able to do is not to predict what will happen, but only to say which
things cannot happen, for these alone contrast violently with the obscure
mass of experiences, in which the isolated fact is not discerned."
We shall appeal to the new axioms in order to clarify a question which
has forced itself upon the attention of physicists ever since Weierstrass
proved the existence of continuous functions without a derivative. Now it is
accepted in analysis that non-differentiable functions are by no means
exceptional in the class of continuous functions. In classical mechanics, by
contrast, it is assumed that every motion possesses a velocity and an
acceleration at every instant, which implies the existence of derivatives
for all the continuous functions that define the motions analytically.
Let
$$\vec{r} = \vec{r}(t) \tag{1}$$
be a vector equation defining the kinematics of the motion $\mu$ of a
material point $M$ of mass $m$ in a lapse of time $\delta = [t_0, t_1]$. It
is supposed there, following Newton, that $t$ physically measures "absolute"
time from an initial instant $t = t_0$, and that the position vector
$\vec{r}$ locates the point $M$ with respect to a fixed system of Cartesian
axes constituting the reference frame of "absolute" space.
In classical mechanics one sets, for the velocity and the acceleration of
the moving point at the instant $t$,
$$\vec{v}(t) = \frac{d\vec{r}}{dt}, \qquad \vec{A}(t) = \frac{d\vec{v}}{dt}, \tag{2}$$
which also defines them as vector functions of $t$.
The continuity of $\vec{r}(t)$ is assumed there; it results from our
intuition of time and of motion. This assertion will not be called into
question on this occasion.
Definition (2) of $\vec{v}(t)$ has a meaning in classical mechanics because
the vector function $\vec{r}(t)$ is there supposed differentiable, a property
attributed to every motion. In order to justify definition (2) of
$\vec{A}(t)$, it is assumed, in addition, that $\vec{v}(t)$ is not only
continuous but also differentiable with respect to $t$, whatever
$t \in \delta$ may be.
Thus every motion $\mu$ would present in classical mechanics characteristics
which are demonstrable "neither mathematically nor empirically", as G. Hamel
asserted [5a; 5b, p. 64; 5c, p. 2] in axiomatizing rational mechanics.
Remarking that "no experiment would be fine enough to descend to the
differentials", Hamel attributed the existence of the velocity and of the
acceleration of a motion $\mu$, at every instant $t$ of $\delta$, to a
physical principle according to which "all observable quantities are
continuous and continuously differentiable". A similar principle was
asserted by L. Zoretti [10, pp. 16, 17, 40] in his study of the principles
of classical mechanics.
Let us denote by $R_a$ an axiom asserting the existence of an acceleration
$\vec{A}(t)$ at every instant $t$ of a motion $\mu$ of the domain $C$ of
validity of classical (Newtonian) mechanics, which also implies the
existence of a continuous velocity $\vec{v}(t)$.
We shall have to distinguish between motions with a purely kinematic
definition (1) and motions $\mu_c$ realizable in $C$, which calls for
explicit definitions. Abstracting from the possible passive resistances
(friction, viscosity, etc.) of a real motion $\mu$ of $C$, one makes
correspond to $\mu$ a "conservative"¹ motion $\mu_c$, which is either equal
to $\mu$ or, when $\mu$ is not conservative, equal to the (kinematic) limit
of the sequence of non-conservative motions obtained by making the passive
resistances of $\mu$ tend successively to zero. This is considered possible,
if not experimentally, at least theoretically.
Let (R) be a classical system of axioms of rational mechanics. Every motion
realizable in $C$ satisfies it, but the converse might not be true. Indeed,
an arbitrary motion of a material point $M$ of mass $m$ being given by
equation (1), it seems doubtful that such a motion should be realizable
whatever its kinematic definition may be. This is a fact already pointed out
by H. Hertz [7, p. 12] in his study of the principles of mechanics. We shall
have to return presently to this point, which is as important as it is
delicate.
The domain $C$ of the motions considered in rational mechanics in Newton's
time was subsequently reduced as a result of the criticism of the principles
which he had laid at the foundation of his "natural philosophy", a criticism
stimulated by the later progress of physics. Newton had admitted at once the
following principles: 1) the existence of an "absolute" time and an
"absolute" space, as well as the "absolute" constancy of mass in motion, and
2) the persistence of the properties of matter down to its ultimate,
"indivisible" parts. Of these principles, the first is today contested by
the mechanics of general relativity, the second by quantum (wave) mechanics.
Consequently, the domain $C$ of rational mechanics is today limited by the
existence of these latter mechanics. That is why our axioms of finiteness
will apply only to rational mechanics, without prejudice to their possible
extension to the new mechanics. Now, from the mechanics of the point one is
led to the mechanics of systems by virtue of axioms which we shall not
examine.
Let us note, however, that the notion of material point itself raises
questions. Since Euler and Lagrange it has often been admitted in classical
mechanics that there exist material points $M$ of zero dimensions but of
non-zero mass $m$. L. Zoretti [10] calls them "fictitious", since the
rotational properties of a very small body are there neglected. But there is
more: the introduction of such points can lead to contradictions. Consider,
for example, abstracting from passive resistances, the motion of $M$ along a
material, plane, vertical curve $\Gamma$ with equation y = x*1*, where $Oy$
is the vertical. If $M$ is to be pressed against $\Gamma$ on its concave
side, $M$ must be a geometric point, since the radius of curvature $\rho$
equals zero at the vertex of $\Gamma$. Now if the velocity $v$ were not zero
there, the constraint force would be infinite there, which is not physically
realizable.
¹ A motion will be called conservative, by definition, if it involves no
degradation of energy (due to passive resistances). For the different,
classical sense attributed to the expression "conservative system", see for
example P. Appell, Traité de mécanique rationnelle, vol. II, 4th ed., Paris
(1923), p. 65 ff.
One may remark, moreover, that the existence of a material point without
dimensions is just as open to criticism as admitting the existence of a real
instant $t$ of time of zero duration, which can likewise lead to
contradictions. These physically inconceivable existences are sometimes
implied by the application of the infinitesimal method in physics, which
ought, it seems, to be the object of an axiomatic analysis, one rather
difficult to carry out.
Such objections also appear in the new mechanics. Let us thus note, in
passing, a significant passage in which Heisenberg, dealing with his
uncertainty principle, expressed as early as 1930 doubts of principle about
the legitimacy of attributing a physical meaning to the passage to the limit
of an elementary volume and an elementary duration, when it is a matter of
evaluating the amplitude of an electric field and of a magnetic field in
quantum mechanics [6, p. 37].
Let us return to our problem, which is that of ridding the axioms of
classical mechanics of the hypothesis $R_a$ defined above. This will be
achieved by first considering, under the most general conditions of
analysis, the vector quantities which occur in mechanics, and by then
seeking the circumstances which impose the existence of derivatives when the
motions are realizable in $C$. This is what we undertook in an earlier work,
in Romanian [3, p. 3-4].
The development of the program indicated requires notions of general
kinematics, which are defined by extending to vector functions $\vec{W}(t)$
of a real variable $t$ the classical properties of real (numerical)
functions of $t$ concerning continuity, bounds, limits, differentiation and
integration. It suffices for us to mention the analogy, which the respective
terminology reflects.
Here, finally, are some definitions of general kinematics, which will serve
us in formulating the axioms of rational mechanics. Consider, once more, a
motion $\mu$ with kinematic definition (1). By definition, the velocity
$\vec{v}(t)$ exists at the instant $t$ if, for $\Delta t \to 0$, it makes
sense to write
$$\vec{v}(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\left[\vec{r}(t + \Delta t) - \vec{r}(t)\right] \tag{3}$$
and this velocity will be called complete. Likewise, there exists at the
instant $t$ a prospective velocity $\vec{v}_+(t)$ or a retrospective velocity
$\vec{v}_-(t)$ when the limit in (3) makes sense for $\Delta t > 0$ or for
$\Delta t < 0$, respectively. Consider, in particular, the case of a motion
$\mu$ such that $\vec{v}(t)$ exists and is continuous at each instant $t$ of
$\delta = [t_0, t_1]$. By definition, the acceleration $\vec{A}(t)$ then
exists at the instant $t$ if, for $\Delta t \to 0$, it makes sense to write
$$\vec{A}(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\left[\vec{v}(t + \Delta t) - \vec{v}(t)\right] \tag{4}$$
and this acceleration will be called complete. Likewise, there exists at the
instant $t$ a prospective acceleration $\vec{a}(t)$ or a retrospective
acceleration $\vec{\alpha}(t)$ when the limit in (4) makes sense for
$\Delta t > 0$ or for $\Delta t < 0$, respectively.
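The distinction between prospective and retrospective velocities can be sketched numerically. The following is only an illustration under stated assumptions: a single scalar component stands in for the vector $\vec{r}(t)$, a small finite step `h` stands in for the limit in (3), and the function name `one_sided_velocities` is hypothetical.

```python
def one_sided_velocities(r, t, h=1e-6):
    """Approximate the prospective and retrospective velocities of r at t.

    r is one scalar component of the position; h is a small positive step
    that stands in for the limit in (3), so the results are approximations.
    """
    prospective = (r(t + h) - r(t)) / h        # increment taken with dt > 0
    retrospective = (r(t - h) - r(t)) / (-h)   # increment taken with dt < 0
    return prospective, retrospective


# A motion with a corner at t = 0: the one-sided velocities disagree,
# so no complete velocity exists at that instant.
v_plus, v_minus = one_sided_velocities(abs, 0.0)

# A smooth motion: both one-sided values approach the same complete velocity.
w_plus, w_minus = one_sided_velocities(lambda t: t * t, 1.0)

print(v_plus, v_minus, w_plus, w_minus)
```

In the first case the two one-sided quotients are +1 and -1; in the second they agree to within the step size, as the definition of a complete velocity requires.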
We shall see that in the domain $C$ of rational mechanics the complete
velocity $\vec{v}(t)$ exists at every instant of a motion realizable in $C$,
and that one then has
$$\vec{v}(t) = \vec{v}_+(t) = \vec{v}_-(t).$$
That is why it is useless to define "accelerations" starting from the
prospective velocity $\vec{v}_+(t)$ and the retrospective velocity
$\vec{v}_-(t)$.
To the preceding kinematic notions let us further add the following
definitions, in order to abbreviate the language:
1) A motion $\mu$ with kinematic definition (1) will be called regular in a
lapse of time $\delta = [t_0, t_1]$ if it possesses at each instant $t$ of
$\delta$ a continuous complete acceleration $\vec{A}(t)$.
2) Considering the motion $\mu$ with equation $\vec{r} = \vec{r}(t)$ and the
motions $\mu_n$ with equations $\vec{r} = \vec{r}_n(t)$, $n = 1, 2, \ldots$,
we shall say that $\mu$ is the kinematic limit of the $\mu_n$ for
$n \to \infty$ when $\vec{r}(t) = \lim_{n \to \infty} \vec{r}_n(t)$.
3) We shall say that a vector $\vec{W}(t)$ changes orientation in space
infinitely many times in an indefinite sequence $\sigma$ (increasing, resp.
decreasing) of successive instants $t_i$ when there exists an axis of space
such that the projections of $\vec{W}(t_i)$, $\vec{W}(t_{i+1})$ on this axis
have opposite senses for infinitely many indices among $i = 1, 2, \ldots$.
4) When in an interval $\delta = [t_0, t_1]$ motions $\mu_1$, $\mu_2$ are
given by $\vec{r} = \vec{r}_1(t)$, $\vec{r} = \vec{r}_2(t)$, we shall say
that the motion $\mu$ given by $\vec{r} = \vec{r}(t)$ is their resultant
motion if $\vec{r}(t) = \vec{r}_1(t) + \vec{r}_2(t)$.
5) Let us call polynomial a motion $\vec{r} = \vec{r}(t)$ where $\vec{r}(t)$
is a polynomial in $t$ whose coefficients are constant vectors of space.
Such a motion is regular.
Let us add the following remark: since continuous vector functions of $t$
are (uniform) limits of polynomials, every motion $\mu$, with kinematic
definition (1), of a point M can be expressed (and in many different ways)
as the kinematic limit of a sequence of polynomial motions $\mu_n$ of the
same point.
Let us now return to the classical system (R) of axioms of mechanics. It
is assumed in rational mechanics that the force $\vec F$ which produces the
motion $\mu$ of a material point M must exist at every instant $t$ of
$\delta$ (even if $\vec F = 0$) and that $\mu$ is subject, at that instant,
to Newton's law (axiom Rn), which reads
$$\vec F = m\,\vec\Gamma \qquad (5)$$
where $m$ is the constant mass of M and $\vec\Gamma$ its acceleration at
the instant $t$. E. Mach and P. Painlevé saw in Rn only a definition of the
force $\vec F$ which produces the motion, but G. Hamel recognized in it an
effective relation, for there exist classes ($\Phi$) of physical phenomena
such that, for each of them, when $m$ denotes the constant mass of the
material point M, there is a general law
$$\vec F = m\,\vec\Phi(\vec r, \vec v, t),$$
where $\vec\Phi$ is a vector function of the variables $\vec r$, $\vec v$,
$t$ attached to the class ($\Phi$), and very often constituting there the
vector of a field. If a determinate motion $\mu$ is in question, specified
by (1) and such that $\vec v(t)$ exists in $\delta$, formulas (3), (4) show
that, during the motion, $\vec F = \vec F(t)$, so that the force which
produces $\mu$ is, in this case, equal to a vector function of $t$ and of
$t$ alone.
The existence of such a force, producing a determinate motion realizable
in C, is ensured by an axiom Rd. One can prove that the acceleration
$\vec\Gamma$, whose existence is assumed in (5), enters there only in the
form of the prospective acceleration $\vec a(t)$:
1) Here is a first argument: if one had $\vec\Gamma = \vec A$ for a motion
$\mu$ realizable in C and at every instant $t$ of $\delta$, the
discontinuities of $\vec F(t)$ would have to have, by virtue of (5), the
character of those of a vector derivative $\vec A(t)$, and therefore could
not be of the first kind (i.e., exhibit a jump). But this is contradicted
by clear indications of physical experience, as the following examples
show:
a. Consider the discontinuous force which sets in motion the balanced
weight of Atwood's machine, at the instant when the additional mass is
stopped. This force makes a jump.
b. Consider the discontinuous force which produces the motion of a point
subject to the Newtonian attraction of a closed spherical surface, at the
instant when it would cross that surface. This force makes a jump.
2) Here is a second argument: when the action of a force ceases to be
exerted ($\vec F = 0$) on a material point M, the point continues to move
with a uniform rectilinear motion by virtue of its acquired velocity. Now
if one had, in (5), the equality $\vec\Gamma = \vec A(t)$ at each instant
$t$ of $\delta$, the force $\vec F$ would be predetermined at the instant
$t_0$ by the motion prior to $t_0$, since the acceleration
$\vec A(t_0) = \vec\alpha(t_0)$ is given by the values of $\vec r(t)$ for
$t < t_0$. This is not only paradoxical, but becomes evidently absurd when
$\vec F(t)$ is discontinuous for $t = t_0$, and moreover it contradicts the
conception of a force as the cause of modification of inertial motion,
whose existence is ensured by an axiom Ri.
Let us also cite two axioms of (R), which complement each other: axiom
Re asserts that the resultant (kinematic) motion of motions realizable in
C is also realizable in C, and axiom Rf asserts that under the
simultaneous action of several forces, it is the force equal to their
vector resultant which replaces them.
A delicate question, already noted in passing, is the following: is the
system (R) of classical axioms, Ra included, also a sufficient condition
for every kinematically defined motion to be realizable in C? This is not
at all likely, and it is even doubtful that there can exist a system of
axioms representing necessary and sufficient conditions for every motion
$\mu$ satisfying them to be realizable in C.
We can nevertheless state sufficient conditions for certain motions to be
realizable in C. Here they are, in the form of propositions which can be
proved without appeal to the axioms of rational mechanics, but by a
constructive method:
I. Every polynomial motion of a material point M is realizable in C.
II. Every motion $\mu$ is realizable in C together with its projections
$\mu_x$, $\mu_y$, $\mu_z$ on the axes.
III. When a motion $\mu$ of a material point M, kinematically defined in
a lapse of time $\delta = [t_0, t_1]$, possesses a prospective acceleration
$\vec a(t)$ which is continuous and does not change its orientation in
space infinitely many times, the motion $\mu$ is realizable in C.
The proofs of Propositions I, II, and III consist in exhibiting the
possibility of constructing the motion $\mu$ in question by means of
suitable mechanisms, when passive resistances are disregarded. These
constructions play the role of existential models.
We can also state conditions necessary for a motion to be realizable in
C, by completing the system (R) with the axioms of finiteness (F) on the
one hand, while abandoning on the other hand axiom Ra, which asserted a
priori the existence of the acceleration at every instant of a motion
realizable in C.
Here, finally, is the statement of our axioms of finiteness, valid in
rational mechanics.
(F). When a motion $\mu$ is realizable in C, in a finite lapse of time
$\delta$, it satisfies the conditions:
F1. Among the sequences of regular, realizable motions having the motion
$\mu$ as kinematic limit, there exists a sequence of motions $\mu_n$,
$n = 1, 2, \ldots$, such that the forces $\vec F_n$ which produce them are
bounded in their totality.
F2. The force $\vec F(t)$ which produces the motion $\mu$ in $\delta$
cannot change orientation infinitely many times in any indefinite
(increasing, resp. decreasing) sequence $\sigma$ of successive instants.
It should be noted that the adjunction of the axioms of finiteness (F) to
a classical system (R) of axioms of motion in C must be carried out
together with the abandonment of axiom Ra. But one cannot give up the
hypothesis of the existence of the acceleration without at the same time
modifying the wording of other axioms of (R), so as no longer to assume
explicitly the existence of $\vec v(t)$ and $\vec A(t)$ for every
$t \in \delta$. The new system of axioms thus obtained will replace (R),
and we are going to set forth its principal implications, in which the
role of the axioms (F) is essential.
To obtain them, one relies on general properties of vector functions of
$t$, as well as on Propositions I, II, III above, which express sufficient
conditions for certain motions to be realizable in C. One obtains the
following results, which express properties belonging to every motion
$\mu$ realizable in C:
1°. There exists at each instant $t$ of $\delta$ a continuous complete
velocity $\vec v(t)$, together with a prospective acceleration $\vec a(t)$
and a retrospective acceleration $\vec\alpha(t)$.
2°. The prospective acceleration $\vec a(t)$ is prospectively continuous
and has only a finite number of instants of discontinuity in a finite
lapse $\delta$.
3°. The complete acceleration $\vec A(t)$ exists and is continuous at each
instant $t$ of $\delta$, except at a finite number (possibly zero) of
instants $t_k$ of $\delta$, where $\vec a(t)$, $\vec\alpha(t)$ are
discontinuous and $\vec a(t_k) \neq \vec\alpha(t_k)$.
4°. Every $\mu$ is, in $\delta$, either a regular motion or a finite
succession of regular motions.
The proof of the preceding properties uses the mathematical apparatus of
the theory of vector functions of real variables [1]. Apart from a few
known propositions, or propositions which extend directly to vector
functions classical properties of numerical functions of a real variable,
we have very often appealed to a proposition inspired by these very
researches, namely:
"When among the prospective (resp. retrospective) derived vectors of a
vector function $\vec V(t)$ for $t = t_0$, a function possessing derived
vectors bounded in their totality in $\delta = [t_1, t_2]$,
$t_0 \in \delta$, there are two derived vectors $\vec D_1$, $\vec D_2$
making a non-zero angle with each other, there exists a variable vector
$\vec W(\tau_p)$, equal to the unique vector derivative of $\vec V(t)$ for
$t = \tau_p$, which changes its orientation in space infinitely many times
in a sequence of values $\tau_p$, $p = 1, 2, \ldots$, tending to $t_0$
decreasingly (resp. increasingly)."
I add that, by definition, a "prospective derived vector" of the variable
vector $\vec V(t)$, for $t = t_0$, corresponds by analogy to one of the
right-hand derived numbers of a real function $f(t)$, whereas the "vector
derivative" corresponds to the unique derivative for $t = t_0$.
The preceding results 1° to 4° appeal directly to the axioms previously
denoted Rd, Ri, Rn, Re, Rf, F1, F2 and do not use axiom Ra. The other
axioms, Rr, on the equality of action and reaction, and Ru, which ensures
the uniqueness of a motion realizable in C for given initial conditions
under the action of given forces, do not enter, at least not explicitly.
Now we have seen that certain axioms of (R) must be expressed in a
modified form before forming, with the axioms (F), the new system. Here
are examples which show the modifications in question:
1°. Axiom Ri (law of inertia): "When at an initial instant $t_0$ of the
lapse $\delta$ a material point M possesses a retrospective velocity
$\vec v_-(t_0)$, and no force is exerted on it during $\delta$, the point M
describes in $\delta$ a rectilinear motion with velocity $\vec v(t)$
constantly equal to $\vec v_-(t_0)$."
By using the initial retrospective velocity, one avoids reintroducing
(even in a weakened form) the hypothesis of the existence of the velocity,
and it suffices indeed to assume physically that one has at one's disposal
the motion of M in a lapse of time, as small as one pleases, prior to
$t_0$.
2°. Axiom Rn (Newton's law): "If in a motion $\mu$ realizable in C of a
material point M, possessing at each instant $t \in \delta$ a continuous
velocity $\vec v(t)$, there exists at a certain instant $t_1 \in \delta$ a
prospective acceleration $\vec a$, and if the force exerted on M is
$\vec F$, then
$$\vec F = m\,\vec a,$$
where $m$ is a constant depending on M and independent of the motion,
i.e., of $t$."
It is clear that the existence of the prospective acceleration
$\vec a = \vec a(t_1)$ is assumed, in this statement, only for the instant
$t = t_1$.
The axioms of finiteness (F) do not claim to be accepted, like axiom Ra,
without being confronted with experience. One can conceive experiments
whose foreseeable results constitute a verification of these axioms. Here
is the outline of one such experiment, taking place without passive
resistances and using perfectly elastic solids:
Let P be a small vertical simple pendulum whose oscillations are limited
on each side by plane vertical obstacles, symmetric with respect to the
vertical plane V containing the axis of suspension. These obstacles are
connected to the ground in such a way as to possess uniform, autonomous
motions, independent of the impacts of the small pendulum and such that
they arrive simultaneously in V at the instant $t_1$. According to the
classical laws of impact, the pendulum would have to perform an infinity
of half-oscillations, of decreasing amplitudes, in a lapse
$\delta = [t_0, t_1]$, which would contradict F2. But in reality the number
of oscillations can only be finite, as is easily verified if one takes
account of the duration of the impacts, calculable according to Hertz's
law. This is why axiom F2 will be verified by this experiment. Neglecting
the duration of the impacts engenders paradoxes, like the one noticed by
D. Gale [4], who thought he had pointed out a case of indeterminacy in
classical mechanics.
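The counting argument of this experiment can be sketched on a deliberately simplified stand-in, which is an assumption and not the author's apparatus: a free point moving on a line between two walls that close in symmetrically at constant speed `c` and meet at time `t1`, with perfectly elastic rebounds from walls unaffected by the impacts. The contact duration is modelled here by a fixed cutoff `contact_time` rather than computed from Hertz's law; all names and numerical values are hypothetical.

```python
def bounce_times(u, c, t1, max_bounces, contact_time=0.0):
    """Instants of successive wall impacts, assuming u > 0 and c > 0.

    The point starts at x = 0 at t = 0 with speed u; the walls sit at
    x = +c*(t1 - t) and x = -c*(t1 - t). Counting stops after max_bounces
    impacts, or as soon as the flight time to the next wall would not
    exceed contact_time (a crude stand-in for a finite impact duration).
    """
    times = []
    t = c * t1 / (u + c)          # first impact: u*t = c*(t1 - t)
    speed = u
    while len(times) < max_bounces:
        times.append(t)
        speed += 2.0 * c          # elastic rebound from a wall moving at c
        separation = 2.0 * c * (t1 - t)   # wall-to-wall distance now
        flight = separation / (speed + c) # time to reach the opposite wall
        if flight <= contact_time:
            break
        t += flight
    return times


# Instantaneous impacts: every requested impact still occurs before t1.
ideal = bounce_times(u=1.0, c=1.0, t1=1.0, max_bounces=1000)
# Impacts of finite duration: the count stays finite.
finite = bounce_times(u=1.0, c=1.0, t1=1.0, max_bounces=10**6,
                      contact_time=1e-3)
```

With instantaneous impacts the list can be made as long as one likes while every entry remains below `t1`, which is the infinity of half-oscillations excluded by F2; with any positive `contact_time` the loop terminates after finitely many impacts.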
One can also establish by reasoning the independence of the axioms (F)
with respect to the classical system {(R) - (Ra)} of axioms.
The question raised by the possible extension of the axioms of
finiteness, valid in C, to the phenomena studied by the new mechanics is
an open problem.
One can relate to this problem some well-known elementary facts, which
can be connected with the axioms (F) and their implications in C:
1) In the mechanics of general relativity, where the mass is a function
of the velocity of the motion, there is the velocity $c$ of light, which
sets a finite bound to the velocity of every motion; this must affect the
wording of axiom F1.
2) In quantum mechanics one recalls that the existence of a velocity was
called into question as early as the first studies of Brownian motion.
Thus, in experimenting, J. Perrin pointed out an analogy of appearance
between Brownian motion and functions without derivatives [9, p. 164],
which confirmed the views of Einstein, who, in his theoretical studies,
had previously shown that the mean velocity over $\Delta t$ of the motion
of a particle tends to no limit as the duration $\Delta t$ tends to zero.
He concluded by remarking that for the observer of this motion the mean
velocity would appear as an instantaneous velocity, but that in fact it
represents no objective property of the motion under investigation, at
least if the theory corresponds to the facts, he added [2].
With these words of extreme caution I too conclude my exposition.
Bibliography
[1] BOURBAKI, N., Fonctions d'une variable réelle. Livre IV, Chap. I, II, III,
Paris 1949.
[2] EINSTEIN, A., Zur Theorie der Brownschen Bewegung. Annalen der Physik,
Serie 4, Vol. 19 (1906), pp. 371-381.
[3] FRODA, A., Sur les fondements de la mécanique des mouvements réalisables du
point matériel (in Romanian). Studii și Cercetări Matematice, t. III,
București 1952.
[4] GALE, D., An indeterminate problem in classical mechanics. Amer. Math.
Monthly, vol. 59 (1952), pp. 291-295.
[5] HAMEL, G., a) Ueber die Grundlagen der Mechanik. Math. Annalen, Bd. 66
(1908), pp. 350-397.
b) Elementare Mechanik. Leipzig, 1912.
c) Die Axiome der Mechanik. Handbuch der Physik, Bd. 5, Berlin 1927, pp.
1-42.
[6] HEISENBERG, W., Die Physikalischen Prinzipien der Quantentheorie. Leipzig
1941.
[7] HERTZ, H., Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt,
Gesammelte Werke, Bd. III, Leipzig 1910.
[8] MACH, E., La Mécanique, exposé historique et critique de son développement
(trad. Ém. Bertrand). Paris 1904.
[9] PERRIN, J., L'Atome. Paris 1912.
[10] ZORETTI, L., Les principes de la mécanique classique. Mémorial des Sciences
Math., Paris 1928.
Symposium on the Axiomatic Method
THE FOUNDATIONS OF RIGID BODY MECHANICS AND
THE DERIVATION OF ITS LAWS FROM THOSE OF
PARTICLE MECHANICS
ERNEST W. ADAMS
University of California, Berkeley, California, U.S.A.
1. Introduction. This paper has three purposes: (1) to give a system of
axioms for classical rigid body mechanics (henceforth abbreviated
'RBM'); (2) to show how these axioms can be derived from those of
particle mechanics (abbr. 'PM'); and (3), using the foregoing derivation
as an example, to give a general characterization of the notion of 're-
duction' of theories in the natural sciences. The axioms to be given are due
jointly to Herman Rubin and the author. They comprise what may be
thought of as the theory of rigid motions under finite applied forces with
moments of inertia given. That part of RBM which deals with the calcu-
lation of moments of inertia from known mass distributions is omitted,
since the laws of motion can be stated directly in terms of total masses
and moments of inertia. Similarly, the theory of impacts, which cannot
be represented in terms of finite forces, is excluded. In the axioms, the
laws of rigid motion are presented de novo, and are not, as is usually the
case, presented as deductive consequences of the laws of PM. It is our
contention that, in spite of superficial differences from the more well-
known examples, the derivation of the laws of RBM from those of PM
can be viewed as an example of reduction. In section 3 we shall analyse
the logical relation which must hold between two theories in order that
one should be reduced to the other, and then in the final section we shall
indicate how RBM may be reduced to PM in accordance with the theory
of reduction previously given.
Because of limitations of space, our discussion both of the theories of
RBM and PM and of the general concept of reduction and its specific
application to RBM and PM will be limited. A complete formal develop-
ment of these topics is given in the author's Ph. D. dissertation, The
Foundations of Rigid Body Mechanics [1].
2. Axioms of Rigid Body Mechanics. Our axioms are based on seven
primitive notions, five of which are closely analogous to the primitive
notions of McKinsey, Sugar, and Suppes' axiomatization of classical PM
[5]. These seven are denoted 'K', 'T', 'g', 'R', 'H', '$\mu$', and
'$\theta$', and their intended interpretations are as follows:
K is a set of rigid bodies.
T is an interval of real numbers representing clock readings during an
interval of time.
g is a function from K to the positive real numbers, such that for every
rigid body k in K, g(k) is the mass of k as measured in some fixed units.
R is a function from $K \times T$ to $E_r$ (the set of ordered r-tuples of
real numbers) such that for each k in K and t in T, R(k, t) is the r-vector
representing the position of the center of mass of k at the instant when
the clock reads t, as measured relative to a system of cartesian
coordinate axes. r-vectors are here construed to be ordered r-tuples of
real numbers, and, of course, in the ordinary application, r = 3.
H is a function from $K \times T \times N$ (N being the set of positive
integers) to $E_r \times E_r$, such that H(k, t, n) represents the n'th
applied force acting on body k at the time t in the following way:
$H^1(k, t, n)$ is the r-vector representing the magnitude and direction of
the n'th applied force, and $H^2(k, t, n)$ is the r-vector representing the
position of the point of application of this force relative to a specially
selected system of coordinate axes which are parallel to the original
reference frame, but which have their origin at the center of mass of k.
We shall call the original axes the 'axes of the space,' and the new axes
the 'non-rotating axes of k'.
$\mu$ is a function from K to the set of r by r matrices with real
components, such that for each k in K, $\mu(k)$ is a matrix representing
the moment of inertia tensor of k relative to still another set of
coordinate axes, which we shall call the 'rotating axes of k.' The
rotating axes of k are a system of cartesian coordinate axes which have,
like the non-rotating axes of k, their origin at the center of mass of k,
but which rotate with k so that they always maintain a fixed relation to
the parts of k. If k is composed of a finite number of mass points with
masses $m_1, \ldots, m_l$ and positions $L_1, \ldots, L_l$ relative to the
rotating axes of k, then the matrix $\mu(k)$ is the sum of the
products:$^1$
$$\mu(k) = \sum_{i=1}^{l} m_i\,(L_i)^*(L_i).$$
1 The transpose L* of an r-vector L is a 'column vector' with r rows, and the
dyadic product of a column vector L* and a row vector, say M (both r-vectors) is
The matrix $\mu(k)$ defined as above is symmetric and positive semi-
definite since all of the $m_i$'s are positive. It is to be particularly noted
that moment of inertia, as characterized here, is independent of time,
because of the fact that it is defined relative to the rotating axes of
k, which always remain fixed within k. To transform to the time-
dependent moment of inertia function used in many formulations of
the laws of RBM (e.g., Milne [7], p. 267 or Joos [3], p. 137 or McConnell
[4], p. 233), it is necessary to introduce our last primitive notion, a
function representing the orientation in space of the rotating co-
ordinate axes of k.
$\theta$ is a function from $K \times T$ to the set of r by r orthogonal
matrices, such that for each k in K and t in T, $\theta(k, t)$ represents
the 'orientation' of the set of rotating coordinates of k at time t
relative to the axes of the space. $\theta(k, t)$ gives the orientation of
the moving axes in the sense that for each $j = 1, \ldots, r$,
$\theta(k, t)_j$ (the j'th row of the matrix $\theta(k, t)$) is the unit
vector in the direction of the j'th axis of the moving axes of k at time
t. Or, $\theta(k, t)_{ij}$ is the cosine of the angle between the i'th
rotating axis of k at time t and the j'th axis of the space.
The equation which relates the time dependent moment of inertia
function $\mu(k, t)$ and the time-independent moment of inertia function
$\mu$ is simply:
$$\mu(k, t) = \theta^*(k, t)\,\mu(k)\,\theta(k, t).$$
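The two constructions just described can be sketched in a few lines, under stated assumptions: the time-independent matrix is taken to be the mass-weighted sum of dyadic products of the body-frame positions with themselves (a reading of "the sum of the products" above, not a formula confirmed by the surviving text), matrices are plain nested lists, and the helper names and the two-particle body are hypothetical.

```python
def dyadic(column, row):
    """Dyadic product: entry (i, j) is column[i] * row[j]."""
    return [[a * b for b in row] for a in column]


def inertia_matrix(masses, positions):
    """Assumed form of mu(k): sum over mass points of m_i * (L_i)*(L_i)."""
    r = len(positions[0])
    mu = [[0.0] * r for _ in range(r)]
    for m, L in zip(masses, positions):
        d = dyadic(L, L)
        for i in range(r):
            for j in range(r):
                mu[i][j] += m * d[i][j]
    return mu


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def space_inertia(theta, mu):
    """Time-dependent matrix mu(k, t) = theta* mu(k) theta."""
    return matmul(matmul(transpose(theta), mu), theta)


# Two unit masses on the first rotating axis, centre of mass at the origin.
mu_body = inertia_matrix([1.0, 1.0], [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
# Rows of theta are the rotating axes expressed in the axes of the space:
# here the body has turned a quarter turn about the third axis.
theta = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
mu_space = space_inertia(theta, mu_body)
```

In this example the masses lie along the first rotating axis, which now points along the second axis of the space, so the single non-zero entry moves from position (1, 1) of `mu_body` to position (2, 2) of `mu_space`.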
The axioms for RBM can now be stated in terms of the seven primitive
concepts just discussed. The style in which these axioms are formulated is
very similar to that of the axioms for classical particle mechanics due to
McKinsey, Sugar, and Suppes [5], and the axioms for relativistic particle
mechanics due to Rubin and Suppes [10] (see also, McKinsey and Suppes
[6]). That is, the axioms are conditions which are parts of the definition of
the set-theoretical predicate system of r-dimensional rigid body mechanics.
Our axioms rely directly on the concept of a system of r-dimensional
particle mechanics, which is defined as follows :
DEFINITION 1. An ordered quintuple $\langle P, T, m, S, F \rangle$ is a
SYSTEM OF CLASSICAL r-DIMENSIONAL PARTICLE MECHANICS if and only if it
satisfies axioms P1-P6.
P1. P is a non-empty finite set.
P2. T is an interval of real numbers.
an r by r matrix (L)*(M) such that the element of its i'th row and j'th column is
$L_i M_j$.
P3. S is an r-vector valued function with domain $P \times T$ such that for
all p in P and t in T, $\frac{d^2}{dt^2}(S(p, t))$ exists.
P4. m is a positive real-valued function with domain P.
P5. F is an r-vector valued function with domain $P \times T \times N$,
where N is the set of positive integers, and for all p in P and t in T the
series $\sum_{i=1}^{\infty} F(p, t, i)$ is absolutely convergent.
P6. For all p in P and t in T,
$$m(p)\,\frac{d^2}{dt^2}(S(p, t)) = \sum_{i=1}^{\infty} F(p, t, i).$$
In the above axioms, P is to be thought of intuitively as a set of
particles, T (again) is an interval of clock readings, m(p) is the mass of
particle p, S(p, t) is the r-vector representing the position of p at time t,
relative to a system of cartesian coordinate axes, and F(p, t, i) is an
r-vector representing the magnitude and direction of the i'th force applied
to p at time t (in the case of particle mechanics it is not necessary to take
into account the point of application of a force since this affects only the
rotation, and not the translation of a particle). The only axiom embodying
what is normally thought of as a 'physical law' is P6, expressing a version
of Newton's Second Law. The first five axioms serve only to define the
set-theoretical character of the primitive notions, and state certain
continuity and differentiability conditions.
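Read as a set-theoretical predicate, Definition 1 is a condition that a candidate structure either satisfies or fails. The sketch below checks only the physical axiom, Newton's Second Law in the usual form (mass times second derivative of position equals the sum of the applied forces), and it does so numerically under several assumptions that are not in the paper: each particle has finitely many forces (returned as a list), the second derivative is approximated by a central difference with step `h`, and the check is made only at sample instants whose neighbours `t - h` and `t + h` lie inside T. All names and the free-fall data are hypothetical.

```python
def satisfies_p6(particles, mass, position, forces, times, h=1e-3, tol=1e-6):
    """Numerically test m(p) * S''(p, t) == sum_i F(p, t, i) at sample times."""
    for p in particles:
        for t in times:
            before = position(p, t - h)
            now = position(p, t)
            after = position(p, t + h)
            for axis in range(len(now)):
                # Central second difference standing in for d^2/dt^2.
                accel = (after[axis] - 2.0 * now[axis] + before[axis]) / (h * h)
                resultant = sum(f[axis] for f in forces(p, t))
                if abs(mass(p) * accel - resultant) > tol:
                    return False
    return True


G = 9.81  # assumed gravitational acceleration for the example


def mass(p):
    return 2.0


def position(p, t):
    # Free fall from rest along the third axis.
    return (0.0, 0.0, -0.5 * G * t * t)


def weight_only(p, t):
    return [(0.0, 0.0, -mass(p) * G)]


def wrong_force(p, t):
    return [(0.0, 0.0, -G)]   # forgets the factor m(p)


sample_times = [0.1, 0.5, 0.9]
ok = satisfies_p6(["p1"], mass, position, weight_only, sample_times)
bad = satisfies_p6(["p1"], mass, position, wrong_force, sample_times)
```

The first structure passes the check and the second fails it, which is all that "satisfying the predicate" means at this level; the remaining axioms only constrain the types of the five components.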
The axioms for RBM are stated in Definition 2, below. It will be seen
that only two of them contain ordinary physical laws, and the remainder,
like axioms P1-P5 in Definition 1, stipulate the set-theoretical type of
the primitives. Axiom R1, stating that the first five elements of a system
of RBM are themselves a system of PM, contains Newton's Second Law,
since the axioms for PM include this law; and axiom R5 is a version of
well-known tensor equations relating moment of inertia, angular acceler-
ation (these two being combined to give the rate of change of angular
momentum), and resultant torque, or moment force.
DEFINITION 2. An ordered septuple
$\langle K, T, g, R, H, \mu, \theta \rangle$ is a SYSTEM OF r-DIMENSIONAL
RIGID BODY MECHANICS if and only if it satisfies axioms R1-R5.
R1. H is a function with domain $K \times T \times N$ taking as values
ordered pairs of r-vectors, and if $H^1$ and $H^2$ are r-vector valued
functions with domain $K \times T \times N$ such that for all k in K, t in
T and i in N,
$$H(k, t, i) = \langle H^1(k, t, i), H^2(k, t, i) \rangle,$$
then $\langle K, T, g, R, H^1 \rangle$ is a system of classical
r-dimensional particle mechanics.
R2. $\theta$ is a function with domain $K \times T$ taking as values r by r
orthogonal matrices, such that for all k in K and t in T,
$\frac{d^2}{dt^2}(\theta(k, t))$ exists.
R3. $\mu$ is a function with domain K taking as values r by r symmetric
positive semi-definite matrices of rank r or r - 1.
R4. For all k in K and t in T, the series
$\sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i)$ is absolutely
convergent.$^2$
R5. For all k in K and t in T,
$$\theta(k, t) \times \Bigl[\mu(k)\,\frac{d^2}{dt^2}(\theta(k, t))\Bigr] = \sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i).$$
The axioms of Definition 2 are all of fairly simple significance. R1 states
essentially that the system which is formed by taking only the masses,
positions of the centers of mass of the rigid bodies, and the magnitudes and
directions of the applied forces, constitutes a system of particle me-
chanics; i.e. it obeys the laws of particle mechanics. This axiom can be
regarded as a version of the theorem that the center of mass of a system
of particles or a rigid body moves as though all the mass of the system were
located there, and all of the forces applied there. R2 specifies that 0(k, t)
is an r by r orthogonal matrix, as is required by the intended interpre-
tation, since the rows of 0(k, t) form a set of orthogonal unit vectors in
the directions of the moving axes of k. It is necessary that this function be
twice differentiable with respect to time in order that the rotational motion
of the body be describable as due to finite applied torques. This axiom
also rules out impacts, in which there may be discontinuous changes of
angular momentum, and for which the angular acceleration does not exist.
The symmetry and positive semi-definiteness of $\mu(k)$ required by axiom
R3 follow also directly from the intended interpretation of this concept
(see p. 3). The restriction on the rank of the matrix $\mu(k)$ amounts to a
restriction on the 'dimension' of the rigid body k. It can be shown that if
all masses are positive, and $\mu(k)$ is defined as on page 3, then the
rank of $\mu(k)$ is equal to the dimension of k, defined as the dimension
of the smallest 'hyperplane' of $E_r$ containing all of the points of k.
2 The matrix cross-product $A \times B$ of two vectors A and B is defined as the
difference A*B - B*A. This is a skew-symmetric matrix, corresponding to a
symmetric double tensor. In three dimensions the matrix $A \times B$ depends only on
three independent components and is closely related to the three-dimensional
vector cross product representing the moment of a force B applied through a lever
arm A.
Axiom R4 requires that the sum
$\sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i)$, which
represents the resultant moment force applied to k at time t relative to
the fixed coordinate axes of k, be absolutely convergent. This requirement
is put on H simply in order that the resultant moment or torque should
not depend on the ordering of the applied forces.
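The matrix cross product that appears in this sum, defined in the paper's footnote as the difference A*B - B*A of two dyadic products, can be sketched as follows. The function names and the sample lever arms and forces are hypothetical, and only finitely many forces are summed; the last helper illustrates the footnote's remark that in three dimensions the skew-symmetric matrix carries the same three components as the ordinary vector cross product.

```python
def matrix_cross(a, b):
    """Entry (i, j) of A x B = A*B - B*A is a[i]*b[j] - b[i]*a[j]."""
    r = len(a)
    return [[a[i] * b[j] - b[i] * a[j] for j in range(r)] for i in range(r)]


def resultant_moment(arms, forces):
    """Sum of the matrix cross products of lever arms with forces."""
    r = len(arms[0])
    total = [[0.0] * r for _ in range(r)]
    for arm, force in zip(arms, forces):
        m = matrix_cross(arm, force)
        for i in range(r):
            for j in range(r):
                total[i][j] += m[i][j]
    return total


def as_vector_3d(m):
    """Read the three independent components of a 3 by 3 skew matrix."""
    return [m[1][2], m[2][0], m[0][1]]


arms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # points of application
forces = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # applied forces
moment = resultant_moment(arms, forces)
torque_vector = as_vector_3d(moment)
```

For a finite list the result is trivially independent of the order of summation; the absolute convergence demanded by R4 is what extends that independence to an infinite series of forces.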
Finally, axiom R5 is a formulation of the well-known law equating rate
of change of angular momentum and moment force. The matrix expression
$\theta(k, t) \times \bigl[\mu(k)\,\frac{d}{dt}(\theta(k, t))\bigr]$
represents the angular momentum of k at time t, and the expression on the
left side of the equation in R5 is the first time derivative of this
angular momentum, which is according to this equation equal to resultant
moment force.
Remark 1. The axioms for classical particle mechanics (Definition 1)
do not contain any version of Newton's Third Law, nor does any version
of it occur in axioms R2 to R5, and therefore our axioms for RBM do not
include this law. This omission may seem strange in view of the fact that
this is the law which justifies neglecting the internal forces acting between
the parts of a rigid body in computing its motion. Two comments are in
order here. First, if Newton's Third Law were not true (as it applies to
internal forces within rigid bodies), it would only be necessary to represent
all forces, internal as well as external, by the function H, and the equa-
tions of linear and angular acceleration (axioms P6 and R5) would still
hold true. Second, the fact that the Third Law is true justifies the omis-
sion of the internal forces, and representing only the external forces by H.
It would become necessary to include Newton's Third Law if a distinction
between external and internal forces were made within this system, and
then the force and moment force occurring in the equations of motion
were defined to be the resultants of the external forces only.
Remark 2. Although Newton's Third Law is not included, our axioms
satisfy two criteria of adequacy for mechanical theories. First, the well
known laws of rigid motion, such as Euler's equations (Whittaker [12],
p. 144), and the tensor forms, as well as the much simpler laws for two-
dimensional rigid motion are derivable from our axioms. Second, it can be
shown that our equations are deterministic in the sense that if the initial
positions and velocities of the bodies are arbitrarily prescribed, and the
applied forces are given, then the paths of the bodies are uniquely
determined.
Remark 3. It is to be observed that, although the primitive notions $\mu$
and $\theta$ are both defined in terms of position vectors and mass in their
intended interpretations, the only formal connection between moment of
inertia, angular position, and mass and position stated in the axioms is
through axiom R5, specifying a connection with moment force, and
axiom R1 (including Newton's Second Law), which in turn links resultant
force with acceleration and mass. If the rotational and translational
concepts were completely independent, this would have the odd conse-
quence that it would be possible to transform the coordinate axes of the
space by, say, a Galilean transformation and change the unit of mass
measurement, without this being accompanied by a corresponding
transformation in the moment of inertia and angular position functions.
In turn, if the transformations of $\mu$ and $\theta$ were independent of those
of space and mass, then the former could not be regarded as tensor
quantities in the usual sense, with prescribed transformation laws. As was
noted above, the translational and rotational concepts are not completely
independent in this theory, since they are both linked to force. The author
has not so far been able to determine, however, whether the two equations
of motion place sufficient constraint on the two kinds of functions, so that
the transformations of the mass and position functions uniquely determine
the transformations of the moment of inertia and angular position
functions.
3. Reduction. A first glance at the usual derivation of the laws of
RBM from those of PM suggests that the reduction of RBM to PM
consists in the following: first, the primitive notions of RBM are defined
in terms of those of PM, as is indicated roughly in the intended inter-
pretations of the primitives of RBM, and then the laws of RBM are
shown to be derivable from those of PM, supplemented by the indicated
definitions. Upon closer inspection, however, two difficulties appear in
the above simple theory of reduction. The first difficulty is of a technical
rather than of a conceptual nature, but is worth noting, none the less.
This is simply that there are, literally, no primitive concepts in the two
theories we have considered. The two theories formulated in Definitions
1 and 2 are, of course, no more than definitions, and the letters 'P', 'T',
'm', 'S', 'F', and 'K', 'T', 'g', 'R', 'H', '$\mu$', and '$\theta$' are
actually only variables employed in the definitions of the predicates
'system of classical r-dimensional particle mechanics,' and 'system of
r-dimensional RBM.'
Each theory, in other words, involves only one new term. This apparent
difficulty is circumvented by simply replacing the definitions of the
various concepts of RBM by a single definition which combines all of
them, and which defines the predicate 'system of RBM' in terms of
'system of PM.' We shall not pursue this problem here, however, but turn
our attention to the second difficulty, which is more serious.
Why, one may ask, should one bother to define the concept of a system
of RBM in terms of that of a system of PM, when, in fact, both are
defined in terms of the concepts of pure mathematics, as they are in
Definitions 1 and 2? If it is the case that the concept of a system of RBM
is definable in terms of set-theoretical concepts alone, as in Definition 2,
and the laws of RBM follow from those of set theory augmented by the
definition in question, then it should follow, according to the theory of
reduction just proposed, that RBM is reducible to set theory.
On intuitive grounds, any definition of 'reduction' which has a con-
sequence that some physical theory is reducible to set theory and analysis,
seems unacceptable.
The solution we shall propose to the difficulty raised above (assuming it
is felt to be one) involves a revision of the concept of a theory which we
have been tacitly assuming up until now; i.e., that a theory — in particu-
lar the theories of RBM and PM — is simply the set-theoretical predicate
defined by its axioms. 3 4 This revision is suggested by a closer examina-
tion of the situation which prevails when one theory is reduced to another.
The reduction of RBM to PM involves more than an arbitrary formal
definition of the concepts of the former theory — moment of inertia and
angular position — in terms of those of the latter, from which the laws
of RBM can be shown to follow. As Nagel [8] has pointed out, these 'defi-
nitions' are actually empirical hypotheses, and as such, ones which might
3 Since the set theoretical predicate is determined by the axioms, and conversely
it determines the axioms in the sense that the axioms are simply statements which
are true of all and only those entities which satisfy the predicate, it makes little
difference whether theories are construed as set-theoretical predicates or as sets of
axioms. Thus, one would not expect to get around the difficulty by simply going
over to the linguistic version of a theory, which construes it as a set of axioms plus
all of the theorems derivable from the axioms.
4 See [6] for a discussion of this concept of a theory.
be false. There is, however, nothing in the account so far given of theories
and their mutual relations which takes into account the fact that theories
and the hypotheses represented by the 'definitions' involved in the
reduction of one theory to another may be either true or false. Our first
step, then, in analyzing the logic of reduction, will be to elaborate the
concept of a theory in such a way that it will be possible to speak of its
truth or falsity. 5
There are undoubtedly many ways of bringing the concept of truth or
correctness into formal consideration. One way, for example, is to require
that the axioms be consistent with a set of observation sentences. In any
case there must be some kind of reference beyond the axioms themselves
to the 'things' they are supposed to describe, or to observations about
those objects. We have chosen to approach this through the notion of an
intended interpretation or an intended model of the theory. Very roughly
speaking, an intended model of a theory is any system which, for one
reason or another, it is demanded that the axioms conform to. There will,
in general, be a large number of systems which satisfy the axioms of a
theory, but usually for theories in empirical science only a few of these
will be intended applications or intended models. For example, in the
case of classical PM, axiomatized in Definition 1, the ordered quintuple
$\langle P, T, m, S, F\rangle$ such that
$P = \{1\}$
$T = [0, 1]$
$m(1) = 1$
$S(1, t) = \langle 0, 0, 0\rangle$ for $0 \le t \le 1$
$F(1, t, n) = \langle 0, 0, 0\rangle$ for $0 \le t \le 1$; $n = 1, 2, 3, \ldots$
5 Some readers will object to speaking of the truth or falsity of a theory, and
would prefer to use the terminology of confirmation. To include the concept of the
confirmation of a theory relative to a given set of data would be to proceed in the
same direction we propose to go: i.e., to include some connections between the
fundamental or defined concepts of the theory and either observation or observation
sentences, which alone will determine either the truth or the degree of confirmation
of the theory. However, the theory of confirmation is at present in such an imperfect
state, as it relates to theories of high complexity, that it would be extremely
difficult if not impossible to found a precise analysis of reduction on it. On the other
hand, the work of Tarski [11] and others on the concepts of truth and satisfaction and
others relating to the interpretation of formal systems makes these concepts ideal
tools for use in precise logical analyses. Our use of the concept of truth rather than
confirmation is thus dictated by the requirements of logical precision; it does not
imply that the author believes that in any 'ultimate' sense the concept of truth is
fundamental and that of confirmation only derivative.
satisfies the axioms of PM, though it is not normally taken as an intended
model simply for the reason that 1 is not a particle. On the other hand,
the system in which P is the set of planets of the solar system together
with the sun, m gives the masses of these objects (in some fixed units), S
gives their locations relative to a system of cartesian coordinate axes fixed
with respect to the fixed stars, and F gives the gravitational forces acting
between the sun and planets, is an intended model of PM (or, at any rate,
was often taken to be one). It is this second kind of intended model which
is expected to satisfy the axioms, and the theory is
judged true or false according as the intended models satisfy the axioms
or not.
If truth and falsity are to be defined, we have seen that two aspects of a
theory must be brought into account: first, the formal aspect which
corresponds to the set-theoretical predicate defined by the axioms (since
we wish later to avoid reference to linguistic entities, such as predicates,
we shall instead consider the extension of this predicate, which is the
set of all systems satisfying the axioms) ; and second, the applied aspect,
corresponding to the set of intended models. Formally, a theory T will
be construed as an ordered pair of sets $T = \langle C, I\rangle$ such that C is the set
of all entities satisfying the axioms, and I is the set of intended models.
We shall call C the "characteristic set" of T. In the case of classical PM,
for example, C is the set of all ordered quintuples $\langle P, T, m, S, F\rangle$
satisfying axioms P1-P6. Just what systems are comprised within the set I
of intended models for classical PM cannot be specified with precision,
owing to the vagueness in the physical concepts of 'particle,' 'position,'
'mass,' and 'force.' Even to attempt an analysis of the intended models of
classical PM would fall outside the scope of this paper. It will turn out,
though, that such an analysis is not essential to our account of reduction,
which rests on certain assumptions about the relations between the
intended models of PM and RBM, and not on any theory as to what
those models are.
One thing which it is essential to note in connection with the intended
models of PM is that they are all 'physical systems' in an extended
sense. They must be entities which could at least conceivably satisfy the
axioms, and therefore they must be ordered quintuples $\langle P, T, m, S, F\rangle$.
Roughly, then, the intended models will be systems $\langle P, T, m, S, F\rangle$ such
that P is a set of particles (physical objects whose size, for the purposes of
the application, can be neglected, and not, for example, numbers), T is a
set of clock readings during an interval of time, m, S, and F are functions
giving the results of measurements of mass, position, and forces applied to
particles of the system during the time interval. Similarly, the intended
models of RBM will be ordered septuples $\langle K, T, g, R, H, \mu, \theta\rangle$ satisfying
the descriptions in the intended interpretations.
With theories characterized as ordered-pairs, the first member of which
is its characteristic set — i.e., all entities satisfying its axioms — and the
second member of which is its set of intended models, "truth" becomes
definable in an obvious way. The theory is true if and only if all of its
intended models satisfy its axioms; otherwise it is false. If $T = \langle C, I\rangle$,
then T is true if and only if I is a subset of C.
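The definition of truth just given can be illustrated with a small sketch. It is only an illustration under stated assumptions: the characteristic set C is represented by a membership test (since C is in general far too large to enumerate), the set I of intended models is taken to be finite, and the names `Theory`, `positive_mass`, and the toy "systems" are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet


@dataclass(frozen=True)
class Theory:
    """A theory T = <C, I>, with C given by a membership test and I as a finite set."""
    in_characteristic_set: Callable[[Any], bool]  # x in C  <=>  x satisfies the axioms
    intended_models: FrozenSet[Any]               # the set I of intended models

    def is_true(self) -> bool:
        # T is true if and only if I is a subset of C.
        return all(self.in_characteristic_set(i) for i in self.intended_models)


# Hypothetical toy "axiom": a system is a pair (mass, speed) with positive mass.
def positive_mass(system):
    mass, _speed = system
    return mass > 0


good = Theory(positive_mass, frozenset({(1.0, 0.0), (2.5, 3.0)}))
bad = Theory(positive_mass, frozenset({(1.0, 0.0), (-1.0, 2.0)}))

print(good.is_true())  # every intended model satisfies the axiom
print(bad.is_true())   # one intended model violates it
```

Representing C by a predicate rather than by an explicit set mirrors the remark in the text that C is the extension of the set-theoretical predicate defined by the axioms.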
In terms of the modified conception of theory outlined above, it is
possible to give what we hope is a more adequate explication of 'reduction'
than the one originally proposed. The 'definition' of the fundamental
concepts of the secondary theory of the reduction (in this case RBM) in
terms of those of the primary theory (PM in this case) represents, we have
argued, an empirical hypothesis. This hypothesis is one which postulates
that there is a certain connection between the intended models of the
secondary and primary theories. In the case of RBM and PM, the as-
sumption is that every rigid body is composed of particles, and that the
masses, positions, applied forces, moments of inertia, and angular
positions of the rigid bodies are related to the masses, positions, and applied
forces on the particles composing them as outlined in the previous section.
This assumption is clearly about the intended interpretations of RBM
and PM and not about all entities satisfying their axioms, since there will
be members of the characteristic set of RBM which are not physical
objects at all, and hence not 'composed' of anything. Similarly, in the
reduction of thermodynamics to statistical mechanics, it is assumed that
all thermal bodies are composed of molecules, and that the absolute
temperature of the body is proportional to the mean kinetic energy of the
molecules composing it. This again is an assumption about the objects to
which the two theories are applied; i.e., about their intended models. In
each reduction, it is assumed that every intended model in the secondary
theory has a particular relation to some intended model of the primary
theory.
It is possible to formalize the above interpretation of the definition of
the concepts of the secondary theory in terms of those in the primary
theory as follows. Let $T_1 = \langle C_1, I_1\rangle$ be the primary theory, and let
$T_2 = \langle C_2, I_2\rangle$ be the secondary theory which is reduced to $T_1$. The
'definition' in question can be represented as a hypothesis that every
intended model $i_2 \in I_2$ of the secondary theory has a special relation R
(which we shall call the 'reduction relation') to an intended model $i_1 \in I_1$
of the primary theory $T_1$. Although we shall not attempt to formalize the
informal characterizations of primary and secondary theories and re-
ductions given above, we shall set down quasi-formally the basic con-
nection just stated between the intended models of the primary and
secondary theories and the reduction relation as Condition A, below.
CONDITION A. Let $T_1 = \langle C_1, I_1\rangle$ and $T_2 = \langle C_2, I_2\rangle$ be two theories
such that $T_2$ is reduced to $T_1$ by relation R. Then for all $i_2$ in $I_2$ there exists
$i_1$ in $I_1$ such that $i_2 \, R \, i_1$.
Simply defining the intended models of the secondary theory in terms
of the intended models of the primary theory does not, of course, reduce
one theory to the other. It must also be shown that in some sense, the
laws of the secondary theory 'follow' from the laws of the primary theory
together with the definition. One way to formulate this requirement,
which avoids reference to such syntactical concepts as derivability, is as
follows: it must be the case that if any element $c_2$ has relation R to some
element $c_1$ which satisfies the laws of the primary theory (i.e., $c_1$ is in $C_1$),
then $c_2$ satisfies the laws of the secondary theory ($c_2$ is in $C_2$). This second
requirement is formulated explicitly in Condition B, below.
CONDITION B. Let $T_1 = \langle C_1, I_1\rangle$ and $T_2 = \langle C_2, I_2\rangle$ be two theories such
that $T_2$ is reduced to $T_1$ by relation R. Then for all $c_1$ and $c_2$, if $c_1$ is in $C_1$
and $c_2 \, R \, c_1$, then $c_2$ is in $C_2$.
Conditions A and B do not, of course, define the concept of a reduction
relation. However, they do have one very important consequence: if a
theory $T_2$ is reduced to a theory $T_1$ by a relation R satisfying Conditions
A and B, then if $T_1$ is correct, $T_2$ is correct. Thus, any reduction
relation which satisfies Conditions A and B satisfies what seems to us to
be the most essential requirement for reduction, namely: it must be pos-
sible to show that if the primary theory in the reduction is correct in that
all of its intended models satisfy its axioms, then all of the intended
models of the secondary theory satisfy its axioms, and therefore the
secondary theory is also correct. This is the core of the reduction of
thermodynamics to statistical mechanics. In this case, what is shown is
that if the laws of statistical mechanics are correct (and this may be
doubtful), and the hypothesis of the reduction is correct (which says that
every thermal body is composed of particles, and its temperature is
proportional to the mean kinetic energy of the particles composing it),
then thermodynamics is correct.
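The step from Conditions A and B to the correctness of the secondary theory is short enough to write out. The following is a sketch of that argument, with $T_1 = \langle C_1, I_1\rangle$ the primary theory, $T_2 = \langle C_2, I_2\rangle$ the secondary theory, and "correct" read as $I \subseteq C$; the layout of the derivation is ours, not the author's.

```latex
% Assume: Condition A, Condition B, and T_1 correct, i.e. I_1 \subseteq C_1.
% Claim:  T_2 is correct, i.e. I_2 \subseteq C_2.
\begin{align*}
  i_2 \in I_2
    &\;\Longrightarrow\; \exists\, i_1 \in I_1 : \; i_2 \, R \, i_1
       && \text{(Condition A)} \\
    &\;\Longrightarrow\; \exists\, i_1 \in C_1 : \; i_2 \, R \, i_1
       && \text{(since } I_1 \subseteq C_1\text{)} \\
    &\;\Longrightarrow\; i_2 \in C_2
       && \text{(Condition B with } c_1 = i_1,\ c_2 = i_2\text{)}
\end{align*}
% Since i_2 was arbitrary, I_2 \subseteq C_2.
```

Note that Condition A carries the empirical content (it speaks only of intended models), while Condition B is a purely formal statement about the characteristic sets.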
Remark. As has been pointed out, Conditions A and B do not define
the concept of reduction. A complete analysis of this notion would un-
doubtedly formulate considerably more restrictive conditions than ours
on the concept. In fact, our conditions are so weak that for any two
correct theories it is possible to construct a trivial relation 'reducing' one
to the other satisfying Conditions A and B. Nagel [8] and Bergmann [2]
have discussed some further restrictions informally. However, it is worth
observing that conditions much like our A and B are central to both of
their analyses.
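The weakness noted in the Remark can be made concrete on finite stand-ins. The sketch below is an illustration only: the sets `C1`, `I1`, `C2`, `I2` are hypothetical finite examples, relations are represented as sets of pairs `(c2, c1)`, and the "trivial relation" chosen here (every intended model of the secondary theory related to every intended model of the primary theory) is one possible construction, assumed rather than taken from the paper. It requires the primary theory to have at least one intended model.

```python
from itertools import product


def condition_a(I1, I2, R):
    """Every intended model of the secondary theory is R-related to some
    intended model of the primary theory."""
    return all(any((i2, i1) in R for i1 in I1) for i2 in I2)


def condition_b(C1, R, in_C2):
    """Whatever is R-related to a member of C1 belongs to C2."""
    return all(in_C2(c2) for (c2, c1) in R if c1 in C1)


# Hypothetical finite stand-ins; both theories are correct (I is a subset of C),
# and the two theories have nothing to do with each other.
C1, I1 = {"a", "b", "c"}, {"a", "b"}
C2, I2 = {1, 2, 3}, {1, 2}

# Trivial relation: relate every intended model of T2 to every intended model of T1.
R_trivial = set(product(I2, I1))

print(condition_a(I1, I2, R_trivial))
print(condition_b(C1, R_trivial, lambda c2: c2 in C2))
```

Both checks succeed even though the relation carries no information about how the two theories are connected, which is why further restrictions of the kind discussed by Nagel and Bergmann are needed.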
4. Reduction of RBM to PM. The reduction relation relating RBM and
PM can be defined by simply formalizing the descriptions of the intended
interpretations of the primitive notions of RBM in terms of the concepts
of PM, as outlined in Section 2. The precise formalization of this defi-
nition is too lengthy to be included here, and we shall only sketch its
main features. Let R be the reduction relation; it is necessary to specify
when R holds between an ordered septuple $\Gamma = \langle K, T, g, R, H, \mu, \theta\rangle$
and an ordered quintuple $A = \langle P, T, m, S, F\rangle$. In the intended
interpretation of K, it is assumed that the elements of K are composed of
particles; i.e., that each rigid body is a set of particles. This requirement
can be formulated by imposing the condition that if $\Gamma$ has relation R to A,
then K must be a partition of P; i.e., the particles composing P can be
separated into sets which 'compose' the rigid bodies in K. In addition
to the requirement that K be a partition of P, it is also necessary to
impose the requirement that the particles which form a particular
element k of K maintain constant mutual distances: that is, if p and q are
both elements of k, then for all t in T,
$|S(p, t) - S(q, t)|$
is a constant.
Not only must the rigid bodies k be composed of elements of P, but the
mass, position, force, moment of inertia, and angular position functions
of $\Gamma$ must have the proper relations to the mass, position, and force
functions of A. For example, the mass $g(k)$ of rigid body k must be the sum
of the masses of the particles composing it. Hence, a condition in the
definition of R must be that if $\Gamma$ has relation R to A, then for all k in K,
$g(k) = \sum_{p \in k} m(p).$
Similarly, if $R(k, t)$ is to represent the position of the center of mass of k
at time t, it must be required that for all k in K and t in T,
$R(k, t) = \frac{1}{g(k)} \sum_{p \in k} m(p)\, S(p, t).$
Similar conditions relate the remaining functions H, μ and θ of $\Gamma$ to the
functions m, S and F of A.
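The conditions sketched so far (K is a partition of P, mutual distances within a body are constant, mass is additive, and the position function gives the center of mass) can be checked mechanically on a small example. The following is a minimal sketch under stated assumptions: the two-particle system, the sampled time points, and the helper names `is_partition`, `is_rigid`, `g`, and `R_pos` are all hypothetical; `R_pos` is used for the position function of the rigid body system to avoid a clash with the reduction relation R. The conditions on H, μ and θ are not covered.

```python
import math

# Hypothetical particle system: two particles translating together along the x-axis.
P = {"p", "q"}
T = [0.0, 0.5, 1.0]          # a few sampled instants of the time interval
m = {"p": 1.0, "q": 3.0}     # particle masses


def S(particle, t):
    """Position of a particle at time t."""
    x0 = {"p": 0.0, "q": 2.0}[particle]
    return (x0 + t, 0.0, 0.0)


K = [frozenset({"p", "q"})]  # one rigid body composed of both particles


def is_partition(K, P):
    """K is a family of non-empty, pairwise disjoint sets whose union is P."""
    union = set().union(*K)
    disjoint = sum(len(k) for k in K) == len(union)
    return union == set(P) and disjoint and all(len(k) > 0 for k in K)


def is_rigid(k, S, T, tol=1e-9):
    """|S(p, t) - S(q, t)| is constant in t for all p, q in k."""
    for p in k:
        for q in k:
            d0 = math.dist(S(p, T[0]), S(q, T[0]))
            if any(abs(math.dist(S(p, t), S(q, t)) - d0) > tol for t in T):
                return False
    return True


def g(k):
    """Mass of the rigid body k: the sum of the masses of its particles."""
    return sum(m[p] for p in k)


def R_pos(k, t):
    """Center of mass of k at time t."""
    total = g(k)
    return tuple(sum(m[p] * S(p, t)[i] for p in k) / total for i in range(3))


body = K[0]
print(is_partition(K, P), is_rigid(body, S, T))
print(g(body), R_pos(body, 0.0), R_pos(body, 1.0))
```

Only finitely many instants are sampled here, so the rigidity check is a spot check rather than a proof that the distance is constant on the whole interval T.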
With the relation R defined, it is possible to ask whether or not it
satisfies Conditions A and B given in Section 3. Condition A requires that
every intended model of RBM has relation R to some intended model of
PM. The intended models of RBM are systems of rigid bodies, and those
of PM are systems of particles. That a system of rigid bodies has relation
R to a system of particles means that the rigid bodies in the first system
are composed of the particles in the second system, that the masses of the
rigid bodies are equal to the sums of the masses of the particles composing
them, and that the other functions of the rigid body system have the
proper relations to the mass, position, and force functions of the particle
system. Clearly whether or not Condition A is satisfied, depends on the
empirical hypothesis that all rigid bodies are composed of particles which
move about as though fixed in rigid frames, and the sum of whose masses
is equal to the mass of the body.
The determination of whether or not Condition B is satisfied does not
raise any empirical questions. In fact, it can be shown logically that if a
system $A = \langle P, T, m, S, F\rangle$ satisfies the axioms of PM, and if
$\Gamma = \langle K, T, g, R, H, \mu, \theta\rangle$ has relation R to A, then $\Gamma$ satisfies the axioms of
RBM. This is essentially what is proven in the usual 'derivations' of the
laws of RBM from those of PM given in text books, and this is equivalent
to a proof that Condition B is satisfied.
Hence it can be rigorously proven that whether relation R actually
gives a reduction of RBM to PM depends on whether the empirical
assumptions involved in Condition A are correct. If they are not, then
RBM has not been reduced to PM, and the usual deduction of the laws
of RBM from those of PM is invalid. If, for example, there were a rigid
body not composed of particles, then it is clear that nothing could be
deduced about its behavior from the laws of particle mechanics, since
those laws only describe the behavior of particles. 6
The empirical question here raised is a very difficult one, and involves
in addition the problem of clarifying the rather vague notion of a particle.
It may be observed that the molecular theory lends support to the hypo-
thesis that rigid bodies are composed of entities small enough to ap-
proximate the point-particles required in the derivation of the laws of
RBM, and the theory of solids indicates that these molecules remain
relatively fixed within rigid bodies. However, the facts that molecules
only approximate point-particles, and that they are not perfectly rigidly
fixed within the bodies they compose, show that the deduction of the
laws of RBM from those of PM depends on an hypothesis which, taken
exactly, is false. The necessary revisions are, however, complicated, and
are, in any case, beyond the scope of this paper.
6 It may be objected that the derivation of the laws of rigid motion includes the
motions of rigid bodies with continuous mass distributions. The properties of bodies
with continuous distributions are, however, derived from the laws of continuum
mechanics (see, e.g., Noll, W. [9] in this volume), which is not a branch, but an
extension of particle mechanics.
Bibliography
[1] ADAMS, E. W., Axiomatic Foundations of Rigid Body Mechanics. Unpublished
Ph. D. dissertation, Stanford University, 1955.
[2] BERGMANN, G., Philosophy of Science. Madison 1957, XII + 181 pp.
[3] JOOS, G., Theoretical Physics. Translated by I. Freeman. New York 1934,
XXIII + 748 pp.
[4] McCONNELL, A. J., Applications of the Absolute Differential Calculus. London
1931, XII + 318 pp.
[5] McKINSEY, J. C. C., A. C. SUGAR and P. SUPPES, Axiomatic Foundations of
Classical Particle Mechanics. Journal of Rational Mechanics, vol. 2 (1953),
pp. 253-272.
[6] McKINSEY, J. C. C. and P. SUPPES, Philosophy and the axiomatic foundations
of physics. Proceedings of the XIth International Congress of Philosophy, vol.
VI (1953), pp. 49-53.
[7] MILNE, E. A., Vectorial Mechanics. New York 1948, XII + 382 pp.
[8] NAGEL, E., The meaning of reduction in the natural sciences. Reprinted in
Readings in Philosophy of Science, P. P. Wiener editor, New York (1953), pp.
531-549.
[9] NOLL, W., The foundations of classical mechanics in the light of recent advances
in continuum mechanics. This volume, pp. 266-281.
[10] RUBIN, H. and P. SUPPES, Transformations of systems of relativistic particle
mechanics. Pacific Journal of Mathematics, vol. 4 (1954), pp. 563-601.
[11] TARSKI, A., The Concept of Truth in Formalized Languages. In Logic, Semantics,
Metamathematics by A. Tarski, translated by J. H. Woodger. Oxford
1956, pp. 152-278.
[12] WHITTAKER, E. T., A Treatise on the Analytical Dynamics of Particles and
Rigid Bodies, 4th ed. New York 1944, XIV + 456 pp.
Symposium on the Axiomatic Method
THE FOUNDATIONS OF CLASSICAL MECHANICS
IN THE LIGHT OF RECENT ADVANCES
IN CONTINUUM MECHANICS 1
WALTER NOLL
Carnegie Institute of Technology, Pittsburgh, Pennsylvania, U.S.A.
1. Introduction. It is a widespread belief even today that classical
mechanics is a dead subject, that its foundations were made clear long
ago, and that all that remains to be done is to solve special problems. This
is not so. It is true that the mechanics of systems of a finite number of
mass points has been on a sufficiently rigorous basis since Newton. Many
textbooks on theoretical mechanics dismiss continuous bodies with the
remark that they can be regarded as the limiting case of a particle system
with an increasing number of particles. They cannot. The erroneous belief
that they can had the unfortunate effect that no serious attempt was
made for a long period to put classical continuum mechanics on a rigorous
axiomatic basis. Only the recent advances in the theory of materials
other than perfect fluids and linearly elastic solids have revived the interest
in the foundations of classical mechanics. A clarification of these foun-
dations is of importance also for the following reason. It is known that
continuous matter is really made up of elementary particles. The basic
laws governing the elementary particles are those of quantum mechanics.
The science that provides the link between these basic laws and the laws
describing the behavior of gross matter is statistical mechanics. At the
present time this link is quite weak, partly because the mathematical
difficulties are formidable, and partly because the basic laws themselves
are not yet completely clear. A rigorous theory of continuum mechanics
would give at least some precise information on what kind of gross
behavior the basic laws ought to predict.
I want to give here a brief outline of an axiomatic scheme for continuum
mechanics, and I shall attempt to introduce the same level of rigor and
clarity as is now customary in pure mathematics. The mathematical
1 The results presented in this paper were obtained in the course of research
sponsored by the U.S. Air Force Office of Scientific Research under contract no.
AF 18 (600)-1138 with Carnegie Institute of Technology.
structures involved are quite complex, and some fine details have to be
omitted in order not to overburden the paper with technicalities.
Notation: Points and vectors in Euclidean space will be indicated
by bold face letters. If x and y are two points, then x - y denotes the
vector determined by the ordered pair (y, x). If x is a point and v a vector,
then x + v denotes the point uniquely determined by (x + v) - x = v.
The word "smooth" will be used instead of "continuously differentiable".
Some equations will be valid only up to a set of measure zero. It will be
clear from the context when this is the case.
2. Bodies.
DEFINITION 1: A BODY is a set $\mathfrak{B}$ endowed with a structure defined by
(a) a set $\Phi$ of mappings of $\mathfrak{B}$ into a three-dimensional Euclidean point
space $E$, and
(b) a real valued set function $m$ defined for a set of subsets of $\mathfrak{B}$,
subject to seven axioms as follows:
(S.1) Every mapping $\varphi \in \Phi$ is one-to-one.
(S.2) For each $\varphi \in \Phi$, the image $B = \varphi(\mathfrak{B})$ is a region in the space $E$, a
region being defined as a compact set with piecewise smooth boundaries.
(S.3) If $\varphi \in \Phi$ and $\psi \in \Phi$, then the mapping $\chi = \psi \circ \varphi^{-1}$ 2 of $\varphi(\mathfrak{B})$ onto
$\psi(\mathfrak{B})$ can be extended to a smooth homeomorphism of $E$ onto itself.
(S.4) If $\chi$ is a smooth homeomorphism of $E$ onto itself and if $\varphi \in \Phi$, then
also $\chi \circ \varphi \in \Phi$.
These four axioms give $\mathfrak{B}$ the structure of a piece of a differentiable
manifold that is isomorphic to a region in Euclidean three-space. The
following three axioms give $\mathfrak{B}$ the structure of a measure space.
(M.1) $m$ is a non-negative measure, defined for all Borel subsets $\mathfrak{C}$ of $\mathfrak{B}$.
(M.2) For each $\varphi \in \Phi$, the measure $\mu_\varphi = m \circ \varphi^{-1}$ induced by $m$ on the region
$B = \varphi(\mathfrak{B})$ in space is absolutely continuous relative to the Lebesgue
measure in $B$. Hence it has a density $\rho_\varphi$ so that
(2.1) $m(\mathfrak{C}) = \int_{\varphi(\mathfrak{C})} \rho_\varphi \, dV.$
(M.3) For each $\varphi \in \Phi$ the density $\rho_\varphi$ is positive and bounded.
2 The symbol $\circ$ denotes the composition of mappings and a superposed $-1$
denotes the inverse of a mapping.
We use the following terminology: The elements $X, Y, \ldots$ of $\mathfrak{B}$ are the
PARTICLES of the body. The mappings $\varphi \in \Phi$ are the CONFIGURATIONS of
the body. The point $x = \varphi(X)$ is the POSITION of the particle $X$ in the
configuration $\varphi$. The set function $m$ is the MASS DISTRIBUTION of the body.
The number $m(\mathfrak{C})$ is the MASS of the set $\mathfrak{C}$. Here and subsequently we
refer to Borel sets simply as sets. The density $\rho_\varphi$ is the MASS DENSITY of $\mathfrak{B}$
in the configuration $\varphi$. Note that it would have been sufficient to require
the existence of $\rho_\varphi$ only for one particular configuration $\varphi$. It then follows
that the mass density exists also for all other configurations.
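The remark that one configuration suffices can be made explicit by a change of variables. The following is only a sketch and is not taken from the source: it assumes that the transition map $\chi = \psi \circ \varphi^{-1}$ and its inverse are both smooth (as axiom (S.3), applied in both directions, suggests), and the symbols $\nabla(\chi^{-1})$ for the Jacobian matrix and $dV_x$, $dV_y$ for Lebesgue volume are notational assumptions.

```latex
% Suppose \rho_\varphi exists for one configuration \varphi, and let \psi \in \Phi.
% Substituting x = \chi^{-1}(y), with \chi = \psi \circ \varphi^{-1}, gives for every set \mathfrak{C}:
\begin{align*}
  m(\mathfrak{C})
    &= \int_{\varphi(\mathfrak{C})} \rho_\varphi(x)\, dV_x \\
    &= \int_{\psi(\mathfrak{C})} \rho_\varphi\bigl(\chi^{-1}(y)\bigr)\,
         \bigl\lvert \det \nabla(\chi^{-1})(y) \bigr\rvert \, dV_y .
\end{align*}
% Hence the density in the configuration \psi exists and is given, almost everywhere, by
\[
  \rho_\psi(y) = \rho_\varphi\bigl(\chi^{-1}(y)\bigr)\,
                 \bigl\lvert \det \nabla(\chi^{-1})(y) \bigr\rvert .
\]
```

Under the same assumptions the Jacobian determinant is continuous and nowhere zero on the compact region $\psi(\mathfrak{B})$, so positivity and boundedness of the density carry over as well.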
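A compact subset $\mathfrak{P}$ of $\mathfrak{B}$ with piecewise smooth boundaries will be
called a PART of the body $\mathfrak{B}$. It may again be regarded as a body whose
configurations are the restrictions to $\mathfrak{P}$ of the configurations of $\mathfrak{B}$ and
whose mass distribution is the restriction of the mass distribution of $\mathfrak{B}$
to the subsets of $\mathfrak{P}$. Two parts $\mathfrak{P}$ and $\mathfrak{Q}$ will be called SEPARATE if
$\mathfrak{P} \cap \mathfrak{Q} \subset \partial\mathfrak{P} \cap \partial\mathfrak{Q},$
where $\partial\mathfrak{P}$ denotes the boundary of $\mathfrak{P}$.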
3. Kinematics
DEFINITION 2: A MOTION of a body $\mathfrak{B}$ is a one-parameter family $\{\theta_t\}$,
$-\infty < t < \infty$, of configurations $\theta_t \in \Phi$ of $\mathfrak{B}$ such that
(K.1) The derivative
(3.1) $v(X, t) = \frac{d}{dt}\,\theta_t(X)$
exists for all $X \in \mathfrak{B}$ and all $t$, it is a continuous function of $X$ and $t$
jointly, and it is a smooth function of $X$.
(K.2) The derivative
(3.2) $\dot{v}(X, t) = \frac{d}{dt}\, v(X, t) = \frac{d^2}{dt^2}\,\theta_t(X)$
exists piecewise and is piecewise continuous in $X$ and $t$ jointly.
The parameter $t$ is called the TIME. Derivatives with respect to $t$ will be
denoted by superposed dots. $v(X, t)$ is called the VELOCITY of the particle
$X$ at time $t$. $\dot{v}(X, t)$ is called the ACCELERATION of $X$ at $t$.
Let $h$ be any real, vector, or tensor valued function of $X$ and $t$, and
assume that $h(X, t)$ is smooth in $X$ and $t$ jointly. We may then associate
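with $h$ the function $\hat{h}$ defined by
(3.3) $\hat{h}(x, t) = h(\theta_t^{-1}(x), t)$
for $-\infty < t < \infty$ and $x \in \theta_t(\mathfrak{B})$. By the chain rule of differentiation we
have
(3.4) $\dot{h}(X, t) = \frac{\partial \hat{h}}{\partial t}(\theta_t(X), t) + \nabla\hat{h}(\theta_t(X), t) \cdot v(X, t),$
where $\nabla\hat{h}$ denotes the gradient of $\hat{h}$ with respect to $x$. It is customary in
the literature to use the same symbol for $h$ and $\hat{h}$, to omit the independent
variables, and to distinguish $\dot{h}$ from $\partial\hat{h}/\partial t$ by writing $\dot{h} = dh/dt$. Equation (3.4)
then takes the familiar form
(3.5) $\frac{dh}{dt} = \frac{\partial h}{\partial t} + v \cdot \operatorname{grad} h.$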
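The chain-rule identity relating the material time derivative to the spatial description can be checked numerically on a simple case. The sketch below is an illustration only, under stated assumptions: it works in one space dimension, the motion `theta` (uniform stretching) and the spatial field `h_hat` are hypothetical choices made for this example, and derivatives are approximated by central differences rather than computed exactly.

```python
def theta(X, t):
    """Hypothetical one-dimensional motion: uniform stretching of the body."""
    return X * (1.0 + t)


def velocity(X, t):
    """v(X, t) = d/dt theta_t(X) for the motion above."""
    return X


def h_hat(x, t):
    """Hypothetical spatial field, given as a function of place x and time t."""
    return x * x * t


def h(X, t):
    """Material description: h(X, t) = h_hat(theta_t(X), t)."""
    return h_hat(theta(X, t), t)


def ddt(f, s, eps=1e-6):
    """Central-difference approximation of the derivative of f at s."""
    return (f(s + eps) - f(s - eps)) / (2.0 * eps)


X, t = 0.7, 0.4
x = theta(X, t)

# Left side: time derivative following the particle X.
lhs = ddt(lambda s: h(X, s), t)

# Right side: local time derivative at the fixed place x, plus the convective term.
partial_t = ddt(lambda s: h_hat(x, s), t)
grad_x = ddt(lambda y: h_hat(y, t), x)
rhs = partial_t + grad_x * velocity(X, t)

print(abs(lhs - rhs) < 1e-6)
```

The convective term `grad_x * velocity(X, t)` is what distinguishes the derivative following the particle from the derivative at a fixed place; dropping it makes the two sides disagree for this motion.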
The LINEAR MOMENTUM at time $t$ of a set $\mathfrak{C} \subset \mathfrak{B}$ is defined by
(3.6) $g(\mathfrak{C}; t) = \int_{\mathfrak{C}} v(X, t)\, dm.$
It follows from (K.1) and (K.2) that $g(\mathfrak{C}; t)$ is piecewise smooth in $t$. As a
function of $\mathfrak{C}$ it is a vector valued measure, absolutely continuous relative
to $m$ with density $v$.
The ANGULAR MOMENTUM at time $t$ of a set $\mathfrak{C} \subset \mathfrak{B}$, relative to a point
$O \in E$, is defined by
(3.7) $h(\mathfrak{C}; t; O) = \int_{\mathfrak{C}} [\theta_t(X) - O] \times v(X, t)\, dm.$
It is piecewise smooth in $t$, and, as a function of $\mathfrak{C}$, it is a vector valued
measure.
4. Forces
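DEFINITION 3: A SYSTEM OF BODY FORCES for a body $\mathfrak{B}$ is a family
$\{B_{\mathfrak{P}}\}$ of vector valued set functions subject to the following axioms:
(B.1) For each part $\mathfrak{P}$ of $\mathfrak{B}$, $B_{\mathfrak{P}}$ is a vector valued measure defined on the
Borel subsets of $\mathfrak{P}$.
(B.2) For each $\mathfrak{P}$, $B_{\mathfrak{P}}$ is absolutely continuous relative to the mass
distribution $m$ of $\mathfrak{P}$. Hence it has a density $b_{\mathfrak{P}}$ so that
(4.1) $B_{\mathfrak{P}}(\mathfrak{C}) = \int_{\mathfrak{C}} b_{\mathfrak{P}}\, dm.$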
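(B.3) The density $b_{\mathfrak{P}}$ is bounded, i.e.
$|b_{\mathfrak{P}}(X)| < k < \infty,$
where $k$ is independent of $\mathfrak{P}$ and $X \in \mathfrak{P}$.
DEFINITION 4: A SYSTEM OF CONTACT FORCES for a body $\mathfrak{B}$ is a family
$\{C_{\mathfrak{P}}\}$ of vector valued set functions subject to the following axioms:
(C.1) For each part $\mathfrak{P}$ of $\mathfrak{B}$, $C_{\mathfrak{P}}$ is a vector valued measure defined on the
Borel subsets of $\mathfrak{P}$.
(C.2) $C_{\mathfrak{P}}(\mathfrak{C}) = C_{\mathfrak{P}}(\mathfrak{C} \cap \partial\mathfrak{P}).$
(C.3) If $\mathfrak{c} \subset \partial\mathfrak{Q}$, $\mathfrak{c} \subset \partial\mathfrak{P}$, and $\mathfrak{P} \subset \mathfrak{Q}$, then
$C_{\mathfrak{P}}(\mathfrak{c}) = C_{\mathfrak{Q}}(\mathfrak{c}).$
(C.4) If $\varphi \in \Phi$ is any configuration of $\mathfrak{B}$ and if $P = \varphi(\mathfrak{P})$, then the induced
measure $C_{\mathfrak{P}} \circ \varphi^{-1}$, when restricted to the Borel subsets of the boundary
surface $\partial P$ of $P = \varphi(\mathfrak{P})$, is absolutely continuous relative to the
Lebesgue surface measure on $\partial P$. Hence it has a density $s(\mathfrak{P}, \varphi)$ so that
(4.2) $C_{\mathfrak{P}}(\mathfrak{c}) = \int_{\varphi(\mathfrak{c})} s(\mathfrak{P}, \varphi; x)\, dA$
for all Borel subsets $\mathfrak{c} \subset \partial\mathfrak{P}$.
(C.5) The density $s(\mathfrak{P}, \varphi)$ is bounded, i.e.
$|s(\mathfrak{P}, \varphi; x)| < l < \infty,$
where $l$ does not depend on $\mathfrak{P}$ or $x \in \varphi(\partial\mathfrak{P})$.
As in the case of a mass distribution, it would have been sufficient in
(C.4) to require the existence of $s(\mathfrak{P}, \varphi)$ only for a particular $\varphi \in \Phi$. The
existence of $s$ for all other configurations is then an automatic consequence.
The axiom (C.2) means that $C_{\mathfrak{P}}$ is essentially a vector measure on the
boundary $\partial\mathfrak{P}$.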
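It is useful to consider surfaces in $\mathfrak{B}$ as being oriented, and to employ
the operation of addition of oriented surfaces in the sense of algebraic
topology. The boundary $\partial\mathfrak{P}$ of a part $\mathfrak{P}$ of $\mathfrak{B}$ will be regarded as oriented
in such a way that the positive side of $\partial\mathfrak{P}$ is exterior to $\mathfrak{P}$. If $\mathfrak{P}$ and $\mathfrak{Q}$
are two separate parts of $\mathfrak{B}$, then
(4.3) $\partial(\mathfrak{P} \cup \mathfrak{Q}) = \partial\mathfrak{P} + \partial\mathfrak{Q}.$
This is true because the common boundary of $\mathfrak{P}$ and $\mathfrak{Q}$, if any, appears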
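twice with opposite orientation on the right side of (4.3) and hence
cancels. We shall say that the surface $\mathfrak{c}$ is a PIECE of the surface $\mathfrak{b}$ if $\mathfrak{c}$ is a
subset of $\mathfrak{b}$ and if the orientation of $\mathfrak{c}$ is induced by $\mathfrak{b}$. The significance of
the axiom (C.3) is brought out by the following theorem:
THEOREM I: There is a vector valued function $S$, defined for all oriented
surfaces $\mathfrak{c}$ in $\mathfrak{B}$, such that
(4.4) $C_{\mathfrak{P}}(\mathfrak{c}) = S(\mathfrak{c})$
whenever $\mathfrak{c}$ is a piece of the boundary $\partial\mathfrak{P}$ of $\mathfrak{P}$. We say that $S(\mathfrak{c})$ is the CONTACT
FORCE ACTING ACROSS THE ORIENTED SURFACE $\mathfrak{c}$.
Proof: For each $\mathfrak{c}$ which is not a piece of $-\partial\mathfrak{B}$ we can find a part
$\mathfrak{Q}(\mathfrak{c})$ of $\mathfrak{B}$ such that $\mathfrak{c}$ is a piece of $\partial\mathfrak{Q}(\mathfrak{c})$. We then define $S(\mathfrak{c}) = C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c})$.
Now let $\mathfrak{P}$ be an arbitrary part of $\mathfrak{B}$ and let $\mathfrak{c}$ be a piece of $\partial\mathfrak{P}$. We then
have
$\mathfrak{c} \subset \partial\mathfrak{P}, \quad \mathfrak{c} \subset \partial\mathfrak{Q}(\mathfrak{c}), \quad \mathfrak{c} \subset \partial(\mathfrak{Q}(\mathfrak{c}) \cap \mathfrak{P}),$
$\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c}) \subset \mathfrak{P}, \quad \mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c}) \subset \mathfrak{Q}(\mathfrak{c}).$
Applying axiom (C.3) twice, we get
$C_{\mathfrak{P}}(\mathfrak{c}) = C_{\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}), \quad C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}) = C_{\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}).$
Hence
$C_{\mathfrak{P}}(\mathfrak{c}) = C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}) = S(\mathfrak{c}).$
If $\mathfrak{c}$ is a piece of $-\partial\mathfrak{B}$ we define
(4.5) $S(\mathfrak{c}) = -S(-\mathfrak{c}).$
It follows from theorem I and axiom (C.4) that there is a vector valued
function $s(\mathfrak{c}, \varphi; x)$ such that
(4.6) $S(\mathfrak{c}) = \int_{\varphi(\mathfrak{c})} s(\mathfrak{c}, \varphi; x)\, dA.$
Also, if $x \in \varphi(\mathfrak{b}) \subset \varphi(\mathfrak{c})$ and if $\mathfrak{b}$ is a piece of $\mathfrak{c}$, then
(4.7) $s(\mathfrak{c}, \varphi; x) = s(\mathfrak{b}, \varphi; x).$
If $\mathfrak{c}_1$ and $\mathfrak{c}_2$ are two pieces of a surface $\mathfrak{c}$ and if $\mathfrak{c} = \mathfrak{c}_1 + \mathfrak{c}_2$, then
(4.8) $S(\mathfrak{c}) = S(\mathfrak{c}_1) + S(\mathfrak{c}_2).$
This is true because $C_{\mathfrak{P}}$, as a measure, is additive and because, by axiom
(C.4), the value of $C_{\mathfrak{P}}$ for the common boundary curve of $\mathfrak{c}_1$ and $\mathfrak{c}_2$ is zero.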
DEFINITION 5: A SYSTEM OF FORCES for a body 93 is a family of vector
valued measures {F^} such that, for each part $ of 93, F^ is defined on the
subsets of ty and such that the F^ have decompositions
(4.9) Fw = By + Cy,
where {B^} is a system of body forces and {C<$} is a system of contact forces.
It is not hard to sec that the decomposition (4.9), if it exists, is auto-
matically unique.
We use the following terminology: The measure $F_{\mathfrak P}$ is the FORCE acting
on the part $\mathfrak P$ of $\mathfrak B$. The vector $F_{\mathfrak P}(\mathfrak P)$ is the RESULTANT FORCE acting on
$\mathfrak P$. Let $\mathfrak P$ and $\mathfrak Q$ be two separate parts of $\mathfrak B$. The vector measure
(4.10)  $F_{\mathfrak P,\mathfrak Q} = F_{\mathfrak P} - F_{\mathfrak P \cup \mathfrak Q}$,
defined on the subsets of $\mathfrak P$, is the MUTUAL FORCE exerted on $\mathfrak P$ by $\mathfrak Q$.
The mutual force exerted on a part $\mathfrak P$ of $\mathfrak B$ by the closure of its complement
is denoted by $F^{(i)}_{\mathfrak P}$ and it is called the INTERNAL FORCE acting on $\mathfrak P$.
The restriction of $F_{\mathfrak B}$ to a part $\mathfrak P$ of $\mathfrak B$ is the EXTERNAL FORCE acting on $\mathfrak P$.
A similar terminology and notation will be used when "force" is replaced
by "body force" or by "contact force".
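Let $\{F_{\mathfrak P}\}$ be a system of forces for a body $\mathfrak B$, $\varphi \in \Phi$ a configuration of $\mathfrak B$,
and $O \in E$ a point in space. The MOMENT about $O$ of the force $F_{\mathfrak P}$ acting
on the part $\mathfrak P$ of $\mathfrak B$ in the configuration $\varphi$ is the vector valued measure
$M(F_{\mathfrak P}, \varphi, O)$ defined by
(4.11)  $M(F_{\mathfrak P}, \varphi, O; \mathfrak C) = \int_{\mathfrak C} [\varphi(X) - O] \times dF_{\mathfrak P}$
for the subsets $\mathfrak C$ of $\mathfrak P$. The vector $M(F_{\mathfrak P}, \varphi, O; \mathfrak P)$ is the RESULTANT
MOMENT about $O$ acting on $\mathfrak P$.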
5. Dynamical processes
DEFINITION 6: A DYNAMICAL PROCESS is a triple $\{\mathfrak B, \theta_t, F_{\mathfrak P,t}\}$, where $\mathfrak B$
is a body, $\theta_t$ is a motion of $\mathfrak B$, and $F_{\mathfrak P,t}$ is a one-parameter family of systems
of forces for $\mathfrak B$, subject to the following two axioms:
(D.1) Principle of linear momentum: For all parts $\mathfrak P$ of $\mathfrak B$ and all times $t$,
(5.1)  $F_{\mathfrak P,t}(\mathfrak P) = \dot g(\mathfrak P; t)$,
where $g$ is defined by (3.6). In words: The resultant force acting on
the part $\mathfrak P$ is equal to the rate of change of the linear momentum of $\mathfrak P$.
(D.2) Principle of angular momentum: Let $O \in E$ be any point in space.
FOUNDATIONS OF CONTINUUM MECHANICS 273
Then for all parts $\mathfrak P$ of $\mathfrak B$ and all times $t$,
(5.2)  $M(F_{\mathfrak P,t}, \theta_t, O; \mathfrak P) = \dot h(\mathfrak P; t; O)$,
where $h$ and $M$ are defined by (3.7) and (4.11), respectively. In words:
The resultant moment about $O$ acting on a part $\mathfrak P$ is equal to the rate
of change of the angular momentum of $\mathfrak P$ relative to $O$.
It would have been sufficient to require that (5.2) holds for a particular
$O \in E$. It is then automatically valid for all points in space. Also, (5.2)
remains valid if the fixed point $O$ is replaced by the variable mass center
(5.3)  $c(\mathfrak P, t) = O + \frac{1}{m(\mathfrak P)} \int_{\mathfrak P} (\theta_t(X) - O)\,dm$
of the part $\mathfrak P$. These statements can be proved in the classical manner.
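We now prove a number of important theorems. For simplicity we
omit the variable $t$; we write
(5.4)  $s(\mathfrak c; x) = s(\mathfrak c, \theta_t; x)$
for the density of the contact force as defined by (4.6).
THEOREM II: For any two separate parts $\mathfrak P$ and $\mathfrak Q$ of $\mathfrak B$ we have
(5.5)  $F_{\mathfrak P,\mathfrak Q}(\mathfrak P) = -F_{\mathfrak Q,\mathfrak P}(\mathfrak Q)$,
i.e. the resultant mutual force exerted on $\mathfrak P$ by $\mathfrak Q$ is equal and opposite to the
resultant mutual force exerted on $\mathfrak Q$ by $\mathfrak P$.
Proof: We apply axiom (D.1) to $\mathfrak P$, $\mathfrak Q$, and $\mathfrak P \cup \mathfrak Q$:
(5.6)  $F_{\mathfrak P}(\mathfrak P) = \dot g(\mathfrak P)$,  $F_{\mathfrak Q}(\mathfrak Q) = \dot g(\mathfrak Q)$,  $F_{\mathfrak P \cup \mathfrak Q}(\mathfrak P \cup \mathfrak Q) = \dot g(\mathfrak P \cup \mathfrak Q)$.
Since $\mathfrak P \cap \mathfrak Q$ has no mass by (M.2), it follows from (3.6) that
  $\dot g(\mathfrak P \cup \mathfrak Q) = \dot g(\mathfrak P) + \dot g(\mathfrak Q)$;
hence, by (5.6),
  $F_{\mathfrak P \cup \mathfrak Q}(\mathfrak P \cup \mathfrak Q) = F_{\mathfrak P}(\mathfrak P) + F_{\mathfrak Q}(\mathfrak Q)$.
It is not hard to see that $F_{\mathfrak P \cup \mathfrak Q}(\mathfrak P \cap \mathfrak Q) = 0$. Hence
  $F_{\mathfrak P \cup \mathfrak Q}(\mathfrak P \cup \mathfrak Q) = F_{\mathfrak P \cup \mathfrak Q}(\mathfrak P) + F_{\mathfrak P \cup \mathfrak Q}(\mathfrak Q)$.
The assertion follows now from the definition (4.10).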
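THEOREM III (reaction principle) 3: The contact force $S(\mathfrak c)$ acting across
$\mathfrak c$ is opposite to the contact force acting across $-\mathfrak c$, i.e.
(5.7)  $S(\mathfrak c) = -S(-\mathfrak c)$.
Proof: If $\mathfrak c$ is a piece of $-\bar{\mathfrak B}$, then (5.7) is true by the definition (4.5).
If not, it is possible to find two separate parts $\mathfrak P$ and $\mathfrak Q$ such that
$\mathfrak P \cap \mathfrak Q = \mathfrak c$ (see Fig. 1). We orient $\mathfrak c$ such that it is a piece of $\bar{\mathfrak P}$. Then $-\mathfrak c$
will be a piece of $\bar{\mathfrak Q}$. The surfaces $\bar{\mathfrak P}$, $\bar{\mathfrak Q}$, and $\overline{\mathfrak P \cup \mathfrak Q}$ have decompositions
  $\bar{\mathfrak P} = \mathfrak c + \mathfrak b$,  $\bar{\mathfrak Q} = (-\mathfrak c) + \mathfrak e$,  $\overline{\mathfrak P \cup \mathfrak Q} = \mathfrak b + \mathfrak e$.
[Fig. 1]
It follows from theorem I and (4.8) that
  $C_{\mathfrak P}(\mathfrak P) = S(\mathfrak c) + S(\mathfrak b)$,  $C_{\mathfrak P \cup \mathfrak Q}(\mathfrak P) = S(\mathfrak b)$,
and hence that
  $C_{\mathfrak P,\mathfrak Q}(\mathfrak P) = C_{\mathfrak P}(\mathfrak P) - C_{\mathfrak P \cup \mathfrak Q}(\mathfrak P) = S(\mathfrak c)$.
Similarly, we obtain
  $C_{\mathfrak Q,\mathfrak P}(\mathfrak Q) = S(-\mathfrak c)$.
For the total resultant mutual forces, we get
(5.8)  $F_{\mathfrak P,\mathfrak Q}(\mathfrak P) = B_{\mathfrak P,\mathfrak Q}(\mathfrak P) + S(\mathfrak c)$,  $F_{\mathfrak Q,\mathfrak P}(\mathfrak Q) = B_{\mathfrak Q,\mathfrak P}(\mathfrak Q) + S(-\mathfrak c)$.
Application of theorem II gives
(5.9)  $S(\mathfrak c) + S(-\mathfrak c) = -[B_{\mathfrak P,\mathfrak Q}(\mathfrak P) + B_{\mathfrak Q,\mathfrak P}(\mathfrak Q)]$.
Using axiom (M.3) one can show that the parts $\mathfrak P$ and $\mathfrak Q$ can be chosen
such that their masses $m(\mathfrak P)$ and $m(\mathfrak Q)$ are arbitrarily small. Axiom (B.3)
then implies that the right side of (5.9) can be made arbitrarily small in
absolute value. It follows that the left side of (5.9) must vanish. Q.e.d.
3 Various statements, mostly quite vague, pass under the title "principle of
action and reaction" in the literature. All of these statements, when made precise,
are provable theorems in the theory presented here.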
As a corollary, it follows that
(5.10)  $S(\mathfrak c_1 + \mathfrak c_2) = S(\mathfrak c_1) + S(\mathfrak c_2)$,
no matter whether $\mathfrak c_1$ and $\mathfrak c_2$ are pieces of $\mathfrak c = \mathfrak c_1 + \mathfrak c_2$, as in (4.8), or not.
Hence $S$ may be regarded as an additive vector valued function of
oriented surfaces in $\mathfrak B$. Another corollary is that the statement of theorem
II remains true if "mutual force" there is replaced by "mutual contact
force" or by "mutual body force".
THEOREM IV (stress principle) 4: There is a vector valued function
$s(x, n)$, where $x \in \theta_t(\mathfrak B)$ and where $n$ is a unit vector, such that
(5.11)  $s(\mathfrak c; x) = s(x, n)$
whenever $\theta_t(\mathfrak c)$ has the unit normal $n$ at $x \in \theta_t(\mathfrak c)$, directed towards the positive
side of the oriented surface $\theta_t(\mathfrak c)$, the orientation of $\theta_t(\mathfrak c)$ being induced by the
orientation of $\mathfrak c$.
4 The assertion of this theorem appears in all of the past literature as an assumption.
It has been proposed occasionally that one should weaken this assumption and
allow the stress to depend not only on the tangent plane at $x$, but also on the curvature
of the surface $\mathfrak c$ at $x$. The theorem given here shows that such dependence on
the curvature, or on any other local property of the surface at $x$, is impossible.
Proof: Let $\mathfrak c_1$ and $\mathfrak c_2$ be two surfaces in $\mathfrak B$ tangent to each other at
$X = \theta_t^{-1}(x)$. The surfaces $c_1 = \theta_t(\mathfrak c_1)$ and $c_2 = \theta_t(\mathfrak c_2)$ in space $E$ are then
tangent to each other at the point $x$. We assume that $n$ is their unit
normal at $x$ and that $\mathfrak c_1$ and $\mathfrak c_2$ are oriented in such a way that $n$ is directed
toward the positive side of $c_1$ and $c_2$. Consider the region $P_1$ bounded by a
piece $d_1$ of $c_1$, a piece of a circular cylinder $f$ of radius $r$ whose axis is $n$,
and by a plane perpendicular to $n$ at a distance $r$ from $x$, as shown in
Fig. 2. The region $P_2$ is defined in a similar manner. Denote the common
boundary of $P_1$ and $P_2$ on the cylinder and the plane by $e$. The boundaries
$\bar P_1$ and $\bar P_2$ then have decompositions into separate pieces of the form
(5.12)  $\bar P_1 = d_1 + e + f_1$,  $\bar P_2 = d_2 + e + f_2$,
where $f_1$ and $f_2$ are pieces of the cylinder $f$. We denote the surface area of a
surface $c$ by $A(c)$ and the volume of a region $P$ by $V(P)$. It is not hard to
see that then
(5.13)  $A(d_i) = \pi r^2 + o(r^2)$,
(5.14)  $A(f_i) = o(r^2)$,
(5.15)  $V(P_i) = O(r^3)$
for $i = 1, 2$. $\mathfrak P_1 = \theta_t^{-1}(P_1)$ and $\mathfrak P_2 = \theta_t^{-1}(P_2)$ will be parts of $\mathfrak B$ for small
enough $r$, except when $x \in \overline{\theta_t(\mathfrak B)}$ and $n$ is directed toward the interior
of $\theta_t(\mathfrak B)$. Applying axiom (D.1) to $\mathfrak P_1$ and $\mathfrak P_2$ gives
[Fig. 2]
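(5.16)  $F_{\mathfrak P_i}(\mathfrak P_i) = \int_{\mathfrak P_i} \dot v\,dm$,  $i = 1, 2$.
By (4.1) and (4.4) this may be written in the form
(5.17)  $S(\bar{\mathfrak P}_i) = \int_{\mathfrak P_i} (\dot v - b_{\mathfrak P_i})\,dm$.
By (4.6), (4.7), (4.8), and (5.12) we have
(5.18)  $S(\bar{\mathfrak P}_i) = \int_{d_i} s(\mathfrak c_i)\,dA + \int_{e} s(\mathfrak e)\,dA + \int_{f_i} s(\mathfrak f_i)\,dA$,
where $\mathfrak f_i = \theta_t^{-1}(f_i)$, $\mathfrak e = \theta_t^{-1}(e)$. Subtracting the two equations (5.18) and
using (5.17), we get
(5.19)  $\int_{d_1} s(\mathfrak c_1)\,dA - \int_{d_2} s(\mathfrak c_2)\,dA = \int_{\mathfrak P_1} (\dot v - b_{\mathfrak P_1})\,dm - \int_{\mathfrak P_2} (\dot v - b_{\mathfrak P_2})\,dm - \int_{f_1} s(\mathfrak f_1)\,dA + \int_{f_2} s(\mathfrak f_2)\,dA$.
Since $\dot v$, $b_{\mathfrak P}$, and the mass density are bounded by constants independent
of $\mathfrak P$, according to the axioms (K.2), (B.3), and (M.3), it follows from
(5.15) that
  $\int_{\mathfrak P_i} (\dot v - b_{\mathfrak P_i})\,dm = o(r^2)$,  $i = 1, 2$.
Similarly, it follows from axiom (C.5) and from (5.14) that
  $\int_{f_i} s(\mathfrak f_i)\,dA = o(r^2)$,  $i = 1, 2$.
Hence, by (5.19),
  $\int_{d_1} s(\mathfrak c_1)\,dA = \int_{d_2} s(\mathfrak c_2)\,dA + o(r^2)$.
Dividing by $\pi r^2$ and using (5.13), we get
(5.20)  $\frac{1}{A(d_1)} \int_{d_1} s(\mathfrak c_1)\,dA = \frac{1}{A(d_2)} \int_{d_2} s(\mathfrak c_2)\,dA + \frac{o(r^2)}{\pi r^2}$.
By a theorem on measures with density, we have
  $\lim_{r \to 0} \frac{1}{A(d_i)} \int_{d_i} s(\mathfrak c_i)\,dA = s(\mathfrak c_i; x)$,  $i = 1, 2$.
Thus, letting $r$ go to zero in (5.20), we finally obtain
  $s(\mathfrak c_1; x) = s(\mathfrak c_2; x)$,
which shows that $s(\mathfrak c; x)$ has the same value for all surfaces $\mathfrak c$ with the
same unit normal $n$. The exceptional case when $x \in \overline{\theta_t(\mathfrak B)}$ and $n$ is
directed toward the interior of $\theta_t(\mathfrak B)$ is taken care of by the definition
(4.5).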
The vector $s(x, n)$ is called the STRESS acting at $x$ across the surface
element with unit normal $n$. By (4.6) the contact force $S(\mathfrak c)$ acting across
$\mathfrak c$ is given by
(5.21)  $S(\mathfrak c) = \int_{\theta_t(\mathfrak c)} s(x, n)\,dA$,
where $n$ is the unit normal at $x$ to the oriented surface $\theta_t(\mathfrak c)$. By theorem
III we have
(5.22)  $s(x, n) = -s(x, -n)$.
The following two additional assumptions suffice to ensure the validity
of the classical theorems of continuum mechanics:
(a) The stress $s(x, n)$, for each $n$, is a smooth function of $x \in \theta_t(\mathfrak B)$.
(b) For almost all $X \in \mathfrak B$, the limit
(5.23)  $b(X) = \lim \frac{B_{\mathfrak P}(\mathfrak P)}{m(\mathfrak P)}$,
where $\mathfrak P$ is a neighborhood of $X$ contracting to $X$, exists.
Under these assumptions, one can prove the following theorems in
the classical manner:
(1) There is a field of linear transformations $S(x)$, $x \in \theta_t(\mathfrak B)$, such that
(5.24)  $s(x, n) = S(x)n$.
$S(x)$ is called the STRESS TENSOR at $x$.
(2) The stress tensor $S(x)$ is symmetric.
(3) Cauchy's equation of motion
(5.25)  $\operatorname{div} S + \rho b = \rho \dot v$
holds, where $S$ is the stress tensor, $\rho$ is the mass density, $\dot v$ is the
acceleration, and $b$ is defined by (5.23).
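The relation between (5.22) and (5.24) can be illustrated numerically. The following is only a sketch: the tensor entries and the normal are made-up values, and the helper name `traction` is an assumption of this example, not notation from the text.

```python
import numpy as np

def traction(S, n):
    """Stress vector s(x, n) = S(x) n for a unit normal n, as in (5.24)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)          # make sure n is a unit vector
    return np.asarray(S, dtype=float) @ n

# Hypothetical symmetric stress tensor at a single point x.
S = np.array([[2.0, 0.5, 0.0],
              [0.5, -1.0, 0.3],
              [0.0, 0.3, 4.0]])
n = np.array([1.0, 2.0, 2.0]) / 3.0    # unit normal

s_plus = traction(S, n)                # stress across the element with normal n
s_minus = traction(S, -n)              # stress across the oppositely oriented element
```

Because $s(x, n)$ is linear in $n$ once (5.24) holds, the reaction property (5.22) is automatic in this representation; the sketch only makes that visible for one choice of values.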
6. Equivalence of dynamical processes. The position of a particle can be
specified physically not in an absolute sense but only relative to a given
frame of reference. Such a frame is a set of objects whose mutual distances
change very little in time, like the walls of a laboratory, the fixed stars,
or the wooden horses on a merry-go-round. In classical physics, a change
of frame corresponds to a transformation of space and time which pre-
serves distances and time intervals. It is well known that the most
general such transformation is of the form
(6.1)  $x^* = c(t) + Q(t)(x - O)$,
       $t^* = t + a$,
where $c(t)$ is a point valued function of $t$, $Q(t)$ is a function of $t$ whose
values are orthogonal transformations, $a$ is a real constant, and $O$ is a
point, which may be fixed once and for all. We assume that $c(t)$ and $Q(t)$
are twice continuously differentiable. A change of frame (6.1) also induces
a transformation on vectors and tensors. A vector $u$, for example, is
transformed into
(6.2)  $u^* = Q(t)u$.
Let $\{\mathfrak B, \theta_t, F_{\mathfrak P,t}\}$ be a dynamical process. A change of frame $\{c, Q, a\}$
will transform the motion $\theta_t$ into a new motion $\theta'_t$ defined by
(6.3)  $\theta'_t(X) = c(t - a) + Q(t - a)[\theta_{t-a}(X) - O]$.
The velocities and the accelerations of the two motions $\theta_t$ and $\theta'_t$ are, in
general, not related by the transformation formula (6.2) for vectors. They
depend on the choice of the frame of reference. We say that they are not
objective. However, there are objective kinematical quantities, for
example the rate of deformation tensor.
If we wish to assume that forces have an objective meaning we would
have to require that $F_{\mathfrak P,t}(\mathfrak C)$ transforms according to the law (6.2) under
a change of frame. However, when this assumption is made, a dynamical
process does not transform into a dynamical process because the axioms
(D.1) and (D.2) are not preserved, except when $c$ is linear in $t$ and $Q$ is
constant. It is this difficulty which has led to the concept of absolute
space and which has caused much controversy in the history of mechanics.
A clarification was finally given by Einstein in his general theory of
relativity, in which gravitational forces and inertial forces cannot be
separated from each other in an objective manner. If we wish to stay in
the realm of classical mechanics we may resolve the paradox by sacrificing
the objectivity of the external body forces while retaining the objectivity
of the essential types of forces, the contact forces and the mutual body
forces. This can be done by assuming that the forces transform according
to a law of the form
(6.4)  $F'_{\mathfrak P,t}(\mathfrak C) = Q(t - a)F_{\mathfrak P,t-a}(\mathfrak C) + I(\mathfrak C, t)$.
Here $I(\mathfrak C, t)$ will be called the INERTIAL FORCE acting on $\mathfrak C$ due to the
change of frame $\{c, Q, a\}$.
DEFINITION 7: Two dynamical processes $\{\mathfrak B, \theta_t, F_{\mathfrak P,t}\}$ and $\{\mathfrak B, \theta'_t, F'_{\mathfrak P,t}\}$
are called EQUIVALENT if there is a change of frame $\{c, Q, a\}$ such that $\theta'_t$ and
$F'_{\mathfrak P,t}$ are related to $\theta_t$ and $F_{\mathfrak P,t}$ by (6.3) and (6.4).
The classical analysis of relative motion shows that the inertial force
$I(\mathfrak C, t)$ is necessarily of the form
(6.5)  $I(\mathfrak C, t) = \int_{\mathfrak C} i(X, t)\,dm$
with
(6.6)  $i(X, t) = \ddot c(t - a) + 2V(t - a)[v'(X, t) - \dot c(t - a)] + [\dot V(t - a) - V^2(t - a)][\theta'_t(X) - c(t - a)]$,
where $v'$ is the velocity of the motion $\theta'_t$, and where $V(t)$ is defined by
(6.7)  $V(t) = \dot Q(t)Q(t)^{-1}$.
It is not hard to see that the inertial force $I$ gives a contribution only to
the external body forces and that the contact forces and the mutual
forces transform according to (6.2) and hence are objective. The external
body forces and the inertial forces cannot be separated from each other in
an objective manner. Experience shows that, for the body consisting
of the entire solar system, there are frames relative to which the external
body forces nearly vanish. These are the classical Galilean frames. Two
equivalent dynamical processes really correspond to the same physical
process, viewed only from two different frames of reference.
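The density (6.6) of the inertial force can be checked numerically for one particle. The sketch below is an illustration under stated assumptions, not part of the theory: the functions `c`, `Q`, and the particle path `x` are arbitrary made-up choices, $O$ is placed at the origin, $Q$ is a rotation about the z-axis at an assumed constant rate `w` (so that $V$ is constant and $\dot V = 0$), and derivatives of the transformed motion are taken by central differences.

```python
import numpy as np

O = np.zeros(3)   # fixed reference point O (assumed to be the origin)
a = 0.7           # time shift of the change of frame
w = 1.3           # assumed constant rotation rate about the z-axis

def c(t):    return np.array([1.0 + 2.0*t + 0.5*t**2, -t, 0.3*t**3])
def c_d(t):  return np.array([2.0 + t, -1.0, 0.9*t**2])      # first derivative of c
def c_dd(t): return np.array([1.0, 0.0, 1.8*t])              # second derivative of c

def Q(t):
    cs, sn = np.cos(w*t), np.sin(w*t)
    return np.array([[cs, -sn, 0.0], [sn, cs, 0.0], [0.0, 0.0, 1.0]])

# V = Qdot Q^{-1}, eq. (6.7); for this Q it is a constant skew matrix.
V = np.array([[0.0, -w, 0.0], [w, 0.0, 0.0], [0.0, 0.0, 0.0]])
V_dot = np.zeros((3, 3))

def x(t):   return np.array([np.sin(t), t**2, 1.0 - t])      # path in the old frame
def acc(t): return np.array([-np.sin(t), 2.0, 0.0])          # its acceleration

def x_new(t):
    """Transformed motion, eq. (6.3)."""
    return c(t - a) + Q(t - a) @ (x(t - a) - O)

t, h = 0.9, 1e-4
v_new = (x_new(t + h) - x_new(t - h)) / (2*h)                # velocity v' in the new frame
a_new = (x_new(t + h) - 2*x_new(t) + x_new(t - h)) / h**2    # acceleration in the new frame

s = t - a
i_density = c_dd(s) + 2*V @ (v_new - c_d(s)) + (V_dot - V @ V) @ (x_new(t) - c(s))

# New acceleration minus the rotated old acceleration should equal (6.6).
residual = a_new - (Q(s) @ acc(s) + i_density)
```

The residual is zero up to finite-difference error, which reflects the statement that the acceleration fails to be objective exactly by the inertial term $i(X, t)$.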
7. Constitutive assumptions. An axiom that characterizes the particular
material properties of a body is called a CONSTITUTIVE ASSUMPTION. It
restricts the class of dynamical processes the body can undergo. A
familiar example is the assumption that the body is rigid. It restricts the
possible motions to those in which the distance between any two particles
remains unchanged in time. More important for modern continuum
mechanics are constitutive assumptions in the form of functional re-
lations between the stress tensor S and the motion Ot. Such relations are
called CONSTITUTIVE EQUATIONS (sometimes also rheological equations of
state or stress-strain relations). A classical example is the constitutive
equation for linear viscous fluids
(7.1)  $S = (-p + \lambda \operatorname{tr} D)I + 2\mu D$,
where $D$ is the rate of deformation tensor, $I$ is the unit tensor, $p$ is the
pressure, and $\lambda$ and $\mu$ are viscosity constants. A wide variety of constitutive
equations have been investigated in recent years 5, and a general
theory of such equations has been developed [2].
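As an illustration of how a constitutive equation of the form (7.1) behaves under a change of frame, the sketch below evaluates it for one made-up velocity gradient and one orthogonal matrix. It assumes, as stated in section 6, that the rate of deformation tensor is objective, so that it transforms as $D^* = QDQ^T$, and that the pressure is an unchanged scalar. The function name `stress` and all numerical values are assumptions of this example.

```python
import numpy as np

def stress(D, p, lam, mu):
    """Linear viscous fluid, eq. (7.1): S = (-p + lam tr D) I + 2 mu D."""
    D = np.asarray(D, dtype=float)
    return (-p + lam * np.trace(D)) * np.eye(3) + 2.0 * mu * D

# Hypothetical velocity gradient; D is its symmetric part.
L = np.array([[0.1, 0.4, 0.0],
              [-0.2, 0.3, 0.5],
              [0.0, 0.1, -0.6]])
D = 0.5 * (L + L.T)

# An orthogonal matrix, obtained from the QR factorization of a fixed matrix.
Qm, _ = np.linalg.qr(np.array([[1.0, 2.0, 0.0],
                               [0.0, 1.0, 3.0],
                               [4.0, 0.0, 1.0]]))

S = stress(D, p=1.5, lam=0.2, mu=0.9)
S_star = stress(Qm @ D @ Qm.T, p=1.5, lam=0.2, mu=0.9)   # stress computed in the new frame
S_rotated = Qm @ S @ Qm.T                                 # old stress transformed as a tensor
```

For this equation the two results agree, because the trace and the unit tensor are unaffected by an orthogonal transformation.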
Constitutive assumptions are subject to a general restriction:
PRINCIPLE OF OBJECTIVITY: If a dynamical process is compatible with
a constitutive assumption then all processes equivalent to it must also be
compatible with this constitutive assumption. In other words, constitutive
assumptions must be invariant under changes of frame.
This principle, although implicitly used by many scientists in the history
of mechanics, was stated explicitly first by Oldroyd [3] and was
clarified further by the author [4]. It is of great importance in the theory
of constitutive equations.
5 A review of the literature and a bibliography is given in [1].
8. Unsolved problems. The axiomatic treatment given here is still too
special. It does not cover concentrated forces, contact couples and body
couples, sliding, impact, rupture, and other discontinuities, singularities,
and degeneracies. It would be desirable to have a universal scheme which
covers any conceivable situation.
A more fundamental physical problem is to find a rigorous unified
theory of continuum mechanics and thermodynamics. Classical thermodynamics
deals only with equilibrium states and hence is not adequate for
processes with fast changes of state in time. Such a unified theory should
lead to further restrictive conditions on the form of constitutive equations
and hence to more definite and realistic theories for special materials.
Also, a satisfactory connection with statistical mechanics can be expected
only after such a theory has been developed.
References
[1] NOLL, W., ERICKSEN, J. L. and TRUESDELL, C., The Non-linear Field Theories
of Mechanics. Article to appear in the Encyclopedia of Physics.
[2] ----, A general theory of constitutive equations. To appear in Archive for
Rational Mechanics and Analysis.
[3] OLDROYD, J. G., On the formulation of rheological equations of state. Proceedings
of the Royal Society of London (A) 200 (1950), 523-541.
[4] NOLL, W., On the Continuity of the Solid and Fluid States. Journal of Rational
Mechanics and Analysis 4 (1955), 3-81.
Symposium on the Axiomatic Method
ON THE AXIOMATIZATION OF MECHANICS
(ZUR AXIOMATISIERUNG DER MECHANIK)
HANS HERMES
Universität Münster, Münster in Westfalen, Germany
1. Since Newton, mechanics has been axiomatized several times.
Axioms have been given for the mechanics of mass points and for
the mechanics of continua, for non-relativistic and for relativistic
mechanics.
For the present discussion these differences are to play no role.
We are interested instead in the kind of primitive notions that occur
in the axiom systems. Most axiomatizations use, among others, kinematical
primitive notions, such as the notions of position, velocity, or acceleration.
Here one either refers to a fixed system, or one admits a class of reference
systems, where the transition between the individual admitted reference
systems is effected by Galilei or Lorentz transformations, respectively.
The possibility of avoiding kinematical primitive notions by reducing
them, by means of definitions, to notions that are epistemologically
prior will not be discussed here.
In most axiom systems (e.g. in McKinsey, Sugar and Suppes [3]),
however, one finds not only purely kinematical primitive notions;
primitive notions such as mass, momentum, or force also occur in them.
These notions must be regarded as typically dynamical, or properly
mechanical, notions. But there are also axiom systems (e.g. Hermes [2])
which manage exclusively with kinematical primitive notions.
When one considers these two possibilities, one will ask what
considerations can be adduced in favor of one or the other. It must
first be emphasized that an axiom system which uses not only kinematical
primitive notions is much simpler than an axiom system which contains
only kinematical primitive notions. From the standpoint of formal
elegance, therefore, axiom systems that contain, e.g., the notion of
mass as a primitive notion will always be preferable. Another reason
that speaks in favor of such axiom systems is mentioned at the end of
this section.
Against the use of the notion of mass and similar notions as primitive
notions in an axiom system of mechanics speaks the following consideration:
For a physicist, masses, momenta, or forces are not immediately given.
He must rather determine these quantities by measurements. Such a
measurement, however, ultimately consists in a reduction to kinematical
notions. If, for instance, one determines a mass by means of a spring
balance, one makes a measurement of position; if one determines it with
the help of collision laws, one measures velocities; or else one makes use
of Newton's third axiom and ascertains accelerations. One may now wish
to take account, in an axiomatization, of the fact that a physicist
determines mechanical quantities in this way by means of kinematical
measurements, by reducing the notion of mass and related notions, through
definitions, to kinematical notions that correspond to the physical
possibilities of measurement.
In determining mechanical quantities by kinematical measurements one
must presuppose the validity of one physical law or another. In determining
mass, for example, this can be the collision law or the law of gravitation
(cf. [1]). A corresponding definition of mass must make use of the
respective physical hypothesis. Depending on which hypothesis one puts
into the definition of mass, one arrives at different and primarily
incomparable theories of mechanics. One should express this clearly and
distinguish distinctly between different mechanics, just as one has long
been accustomed to speaking of different geometries.
Every such mechanics is of course an idealization. It is so in
particular in the following respect: A physicist will of course not
restrict himself in his measurements to determining, e.g., mass exclusively
by means of a single physical law; he will rather reserve the right to
choose the most suitable law according to the circumstances. An
axiomatization of mechanics which defines mass on the basis of a single
physical hypothesis privileges this law to a special degree. One will be
the more readily reconciled to such a preference the more fundamental the
law is on which one thereby falls back.
The fact that one will not readily be inclined to privilege a particular
physical law in a definition of, e.g., mass may have contributed to many
authors' preferring to use mass and other non-kinematical notions as
primitive notions in an axiomatization of mechanics.
2. In what follows we report on the attempt, undertaken in [2], to
axiomatize mechanics using purely kinematical primitive notions. For the
definition of mass, recourse is had to the fundamental law of conservation
of momentum in inelastic collisions. At the same time some imperfections
are to be removed here to which B. Rosser has drawn attention in his
review [4] (cf. also the corrections in the appendix). In the treatise [2]
mentioned, relativistic continuum mechanics was developed. Since only the
basic ideas matter in what follows, non-relativistic point mechanics will
be axiomatized here, which is considerably simpler in detail.
First some preliminary remarks on the symbolism. The axiomatization
presupposes a theory of the real numbers. The mechanical statements proper
are rendered in a logic of types, where on the lowest level two sorts of
individual variables are used. The individual variables $\tau_1, \tau_2, \ldots$
refer to real numbers, the individual variables $x, y, \ldots$ to momentary
mass points. A momentary mass point is a mass point considered at a definite
instant, i.e. a temporal section through the world line of a mass point.
One can dispense with the explicit use of the notion of a mass point if one
conceives a mass point as a maximal class of momentary mass points that
"belong together". That momentary mass points belong together means that
they belong to one and the same mass point. Following Lewin, momentary mass
points that belong together will be called genidentical.
Positions and times of momentary mass points are fixed by reference
systems. A reference system $\Sigma$, as used in mechanics, can be regarded as
a five-place relation between real numbers $\tau_1, \tau_2, \tau_3, \tau_4$ and
momentary mass points $x$. $\Sigma\tau_1\tau_2\tau_3\tau_4 x$ says that $x$ has in
the reference system $\Sigma$ the space coordinates $\tau_1, \tau_2, \tau_3$ and the
time coordinate $\tau_4$. Frequently the abbreviation $\Sigma\mathfrak r\tau x$ is used
for $\Sigma\tau_1\tau_2\tau_3\tau_4 x$.
The axiom system contains two primitive notions, namely the two-place
relation $G$ and the class $\mathfrak B$. $Gxy$ is to mean that the momentary mass
points $x$ and $y$ are genidentical, i.e. that they belong to the same mass
point. $\mathfrak B\Sigma$ says that $\Sigma$ is a Galilean reference system (inertial
system).
3. After these preparations the axioms will be formulated. For
simplicity, the axioms are stated using the convention that freely
occurring variables $\mathfrak r, \tau, \ldots, x, y, \ldots, \Sigma, \ldots$ are thought of
as universally quantified.
The first axiom states that $G$ is an equivalence relation:
AXIOM 1.1. $Gxx$
AXIOM 1.2. $Gxy \to Gyx$
AXIOM 1.3. $Gxy \wedge Gyz \to Gxz$
Axiom 2.1 expresses that the coordinates of a momentary mass point $x$
are uniquely determined in every reference system $\Sigma$. Axiom 2.2 says
that genidentical momentary mass points $x, y$ are identical if they have
the same time coordinate in a reference system $\Sigma$. In Axiom 2.3 it is
required that a mass point have an "infinite lifetime".
AXIOM 2.1. $\mathfrak B\Sigma \wedge \Sigma\mathfrak r_1\tau_1 x \wedge \Sigma\mathfrak r_2\tau_2 x \to \mathfrak r_1 = \mathfrak r_2 \wedge \tau_1 = \tau_2$
AXIOM 2.2. $\mathfrak B\Sigma \wedge Gx_1x_2 \wedge \Sigma\mathfrak r_1\tau x_1 \wedge \Sigma\mathfrak r_2\tau x_2 \to x_1 = x_2$
AXIOM 2.3. $\mathfrak B\Sigma \to \bigvee_{\mathfrak r}\bigvee_{y} (Gxy \wedge \Sigma\mathfrak r\tau y)$
The connection between the various coordinate systems is established
by means of the so-called Galilei transformations. gal $\Gamma$ is to mean that
$\Gamma$ is a Galilei transformation. The notion of a Galilei transformation is
well known. Every such transformation is a special automorphism of the
whole real four-dimensional space. If one subjects the coordinates of all
momentary material points in an inertial system $\Sigma$ to such a Galilei
transformation $\Gamma$, one obtains new coordinates. That this assignment is
again an inertial system is required in Axiom 3.1. That any two inertial
systems are connected in this way is asserted in Axiom 3.2. Here one must
note (cf. Rosser's criticism of the original formulation of the
corresponding axiom A.4.5 in [2]) that it is not assumed that the whole
three-dimensional space is occupied by mass points. Two inertial systems
$\Sigma_1$ and $\Sigma_2$ therefore yield transformations only for those quadruples
of real numbers for which there are momentary mass points that have these
quadruples as coordinates in $\Sigma_1$. One may therefore only require that
the coordinate transformation obtained in this way be contained in a
Galilei transformation.
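The idea that a Galilei transformation acts on all of four-dimensional space, while only finitely many coordinate quadruples may actually be occupied by momentary mass points, can be sketched as follows. This is an illustration only: the parametrization $(R, u, d, s)$ with $\mathfrak r' = R\mathfrak r + u\tau + d$, $\tau' = \tau + s$ is the usual textbook form and is an assumption here, since the text does not spell out the definition of gal $\Gamma$; the function name `galilei` and all numbers are made up.

```python
import numpy as np

def galilei(R, u, d, s):
    """Return the map (r, t) -> (R r + u t + d, t + s) on R^3 x R."""
    R, u, d = (np.asarray(z, dtype=float) for z in (R, u, d))
    def transform(r, t):
        return R @ np.asarray(r, dtype=float) + u * t + d, t + s
    return transform

ang = 0.4                                   # rotation angle about the z-axis
R = np.array([[np.cos(ang), -np.sin(ang), 0.0],
              [np.sin(ang),  np.cos(ang), 0.0],
              [0.0, 0.0, 1.0]])
gamma = galilei(R, u=[0.5, 0.0, -1.0], d=[2.0, 1.0, 0.0], s=3.0)

# Only these quadruples are occupied by momentary mass points (hypothetical data).
occupied = [([0.0, 0.0, 0.0], 0.0), ([1.0, 2.0, 2.0], 0.0), ([3.0, 0.0, 4.0], 2.0)]
new_coords = [gamma(r, t) for r, t in occupied]

(r0, t0), (r1, t1), (r2, t2) = new_coords
simultaneous_distance = np.linalg.norm(r1 - r0)   # distance between simultaneous events
time_interval = t2 - t0                           # time between two events
```

The coordinate change read off from the occupied quadruples is a partial map; in the sense of Axiom 3.2 it is contained in the total map `gamma`, which preserves time intervals and distances between simultaneous events.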
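In the two following axioms two "concatenations" occur, which are first
to be explained:
DEFINITION: $(\Gamma/\Sigma)\mathfrak r\tau x$ means $\bigvee_{\mathfrak r'}\bigvee_{\tau'} (\Gamma\mathfrak r\tau\mathfrak r'\tau' \wedge \Sigma\mathfrak r'\tau' x)$
DEFINITION: $(\Sigma_1/\Sigma_2^{-1})\mathfrak r\tau\mathfrak r'\tau'$ means $\bigvee_{x} (\Sigma_1\mathfrak r\tau x \wedge \Sigma_2\mathfrak r'\tau' x)$
With these we formulate
AXIOM 3.1. $\mathfrak B\Sigma \wedge \mathrm{gal}\,\Gamma \to \mathfrak B(\Gamma/\Sigma)$
AXIOM 3.2. $\mathfrak B\Sigma_1 \wedge \mathfrak B\Sigma_2 \to \bigvee_{\Gamma} (\mathrm{gal}\,\Gamma \wedge \Sigma_1/\Sigma_2^{-1} \subset \Gamma)$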
We wish to read off the masses from velocities in collision experiments.
If we take an inertial system $\Sigma$ as a basis, the velocity of a momentary
mass point $x_0$ is given as $\lim_{\tau \to \tau_0} (\mathfrak r - \mathfrak r_0)/(\tau - \tau_0)$;
here let $\Sigma\mathfrak r_0\tau_0 x_0$ hold, and let $\Sigma\mathfrak r\tau x$ hold for that
momentary mass point $x$ which is genidentical with $x_0$ and which has the
time coordinate $\tau$ in $\Sigma$. The existence and uniqueness of $x$ follow from
Axioms 2.3 and 2.2. In what follows we shall consider inelastic collisions.
It is simplest to assume that such collisions take place instantaneously
and that the velocities of the mass points involved change discontinuously
in the process. We shall therefore speak of velocities immediately before
the collision, or briefly pre-velocities, and correspondingly of
post-velocities. $\mathrm{Vel}^-\,\Sigma\mathfrak v\tau x$ is to say that in the inertial system
$\Sigma$ the momentary mass point $x$ has the pre-velocity $\mathfrak v$ at time $\tau$.
$\mathrm{Vel}^+\,\Sigma\mathfrak v\tau x$ is introduced analogously. For brevity we forgo here the
explicit definition of $\mathrm{Vel}^-$ and $\mathrm{Vel}^+$ and formulate the next axiom only
in ordinary language:
AXIOM 4. The mass points have a piecewise continuous velocity; at the
points of discontinuity at least the limits from the left and from the
right exist (pre-velocity and post-velocity, respectively).
4. In a collision two mass points meet at the same place. This is to
be expressly admitted here (in contrast to the treatise cited), in order to
be able to present the collision laws as simply as possible. It must be
remarked, however, that the principle of the impenetrability of matter is
thereby sacrificed. (In the treatise [2] cited no such assumption is made.)
An inelastic collision is characterized by the fact that the two mass
points involved have the same velocity immediately after the collision.
With the choice of a suitable reference system $\Sigma$ this velocity can be
taken to be $\mathfrak o$. Let it be required that the velocities immediately
before the collision be different from $\mathfrak o$. Finally it must also be
demanded that only the two mass points considered take part in the
collision, i.e. that at the time of the collision no third mass point is
located at the place of collision. Stoss $\Sigma\tau x_1x_2$ (collision) is to mean
that the mass points belonging to the momentary mass points $x_1, x_2$
undergo, at the time $\tau$ measured in $\Sigma$, a collision in which they come to
rest (as measured in $\Sigma$).
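DEFINITION: Stoss $\Sigma\tau x_1x_2 =_{Df} \mathfrak B\Sigma \wedge \bigvee_{\mathfrak r} (\Sigma\mathfrak r\tau x_1 \wedge \Sigma\mathfrak r\tau x_2)$
  $\wedge\ \mathrm{Vel}^+\,\Sigma\mathfrak o\tau x_1 \wedge \mathrm{Vel}^+\,\Sigma\mathfrak o\tau x_2$
  $\wedge \bigvee_{\mathfrak v_1}\bigvee_{\mathfrak v_2} (\mathrm{Vel}^-\,\Sigma\mathfrak v_1\tau x_1 \wedge \mathrm{Vel}^-\,\Sigma\mathfrak v_2\tau x_2 \wedge \mathfrak v_1 \neq \mathfrak o \wedge \mathfrak v_2 \neq \mathfrak o)$
  $\wedge \bigwedge_{y}\bigwedge_{\mathfrak r} (\Sigma\mathfrak r\tau y \wedge \Sigma\mathfrak r\tau x_1 \to y = x_1 \vee y = x_2)$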
For the inelastic collision the law of conservation of momentum holds.
If $m_1$ and $m_2$ are the masses of the mass points involved and $\mathfrak v_1$ and
$\mathfrak v_2$ the velocities before the collision measured in $\Sigma$, then one has
$m_1\mathfrak v_1 + m_2\mathfrak v_2 = \mathfrak o$. This provides the possibility of determining
the mass ratio $\alpha$ from the ratio of the magnitudes of the velocities.
Masse $\alpha x x_0$ (mass) is to say, in content, that the mass of $x$ is $\alpha$ times
as large as the mass of the comparison mass point $x_0$.
If $x$ and $x_0$ are genidentical, a collision experiment becomes
illusory. In this case the mass ratio is to be set equal to one by
definition. We thus arrive at the fundamental
DEFINITION: Masse $\alpha x x_0 =_{Df} \bigvee_{\Sigma}\bigvee_{\tau}\bigvee_{y}\bigvee_{y_0}\bigvee_{\mathfrak v}\bigvee_{\mathfrak v_0} (Gxy \wedge Gx_0y_0 \wedge \text{Stoss}\ \Sigma\tau y y_0$
  $\wedge\ \mathrm{Vel}^-\,\Sigma\mathfrak v\tau y \wedge \mathrm{Vel}^-\,\Sigma\mathfrak v_0\tau y_0 \wedge \alpha \cdot |\mathfrak v| = |\mathfrak v_0|) \vee (Gxx_0 \wedge \alpha = 1)$
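The defining condition $\alpha \cdot |\mathfrak v| = |\mathfrak v_0|$ can be tried out on made-up numbers. In the sketch below the three masses, the helper names `mass_ratio` and `pre_velocities`, and the collision data are all hypothetical; the pre-velocities are generated from assumed masses via $m\mathfrak v + m_0\mathfrak v_0 = \mathfrak o$, and the mass ratio is then recovered from the velocities alone, as in the definition.

```python
import numpy as np

def mass_ratio(v, v0):
    """alpha with alpha * |v| = |v0|: mass of x relative to the comparison point x0."""
    nv, nv0 = np.linalg.norm(v), np.linalg.norm(v0)
    if nv == 0.0 or nv0 == 0.0:
        raise ValueError("pre-velocities must be different from zero")
    return nv0 / nv

def pre_velocities(m, m0, direction, speed):
    """Pre-velocities of an inelastic collision ending at rest: m v + m0 v0 = 0."""
    d = np.asarray(direction, dtype=float)
    v = speed * d / np.linalg.norm(d)
    v0 = -(m / m0) * v
    return v, v0

# Hypothetical masses of three mass points x, y, z (unknown to the "observer").
mx, my, mz = 2.0, 3.0, 5.0

alpha = mass_ratio(*pre_velocities(my, mz, [1.0, 0.0, 0.0], 1.5))   # Masse alpha y z
beta = mass_ratio(*pre_velocities(mz, mx, [0.0, 1.0, 0.0], 0.8))    # Masse beta z x
gamma_ = mass_ratio(*pre_velocities(mx, my, [1.0, 1.0, 0.0], 2.0))  # Masse gamma x y

cyclic_product = alpha * beta * gamma_
```

When the velocities really do come from fixed masses, the product of the three ratios around the cycle is 1; for arbitrary kinematical data nothing forces this, so a condition of that kind has to be postulated separately.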
In what follows further axioms will be formulated which concern the
existence and uniqueness of the mass ratio. Since only a discussion of
principle matters here, no importance is attached to formulating the axioms
as weakly as possible. For a complete development of mechanics it is of
course necessary to require, beyond the axioms mentioned, still further
ones, which concern, e.g., the validity of the momentum theorem. In
addition, among other things, the notion of force would have to be
introduced. For this we refer to [2].
First we formulate an axiom which expresses that the mass just introduced
is a genuine ratio.
AXIOM 5. Masse $\alpha yz\ \wedge$ Masse $\beta zx\ \wedge$ Masse $\gamma xy \to \alpha\beta\gamma = 1$
We now formulate some simple theorems. Theorem 5 states that the mass
ratio is unique, i.e. depends only on the mass points involved.
THEOREM 1: Stoss $\Sigma\tau x_1x_2 \to$ Stoss $\Sigma\tau x_2x_1$
THEOREM 2: Masse $\alpha xx_0 \to \alpha \neq 0$
THEOREM 3: Masse $\alpha xx_0 \to$ Masse $\frac{1}{\alpha} x_0x$
THEOREM 4: Masse $\alpha xx_0 \wedge Gxy \wedge Gx_0y_0 \to$ Masse $\alpha yy_0$
THEOREM 5: Masse $\alpha xx_0\ \wedge$ Masse $\beta yy_0 \wedge Gxy \wedge Gx_0y_0 \to \alpha = \beta$
PROOF: Theorem 1 follows immediately from the definition of collision.
Theorem 2 results from the fact that, by the definition of collision, the
pre-velocities in question are different from $\mathfrak o$. Theorem 3 follows from
Theorem 1 and Theorem 2. Theorem 4 follows from the definition of mass.
Theorem 5 is shown thus: First one has Masse $\alpha yy_0$ by Theorem 4 and hence
Masse $\frac{1}{\alpha} y_0y$ by Theorem 3. Because of $Gyy$, Masse $1yy$ holds. Now one has
Masse $\beta yy_0\ \wedge$ Masse $\frac{1}{\alpha} y_0y\ \wedge$ Masse $1yy$, hence $\beta \cdot \frac{1}{\alpha} \cdot 1 = 1$ by Axiom 5.
The last axiom to be discussed here concerns the existence of the mass
ratio:
AXIOM 6. $\bigvee_{\alpha}$ Masse $\alpha xx_0$
This axiom expresses that any two distinct mass points collide
inelastically at least once. That is a very strong requirement. (One could
weaken this requirement by demanding only that for any two distinct mass
points there is a finite chain of mass points of which any two successive
ones at some time collide inelastically, and by modifying the definition of
the mass ratio accordingly. But even an axiom modified in this way would
express a strong requirement.) Of this axiom (or rather of the analogous
axiom 8.1 in [2]) Rosser says in [4] that it demands that the mass points
"behave in certain very peculiar fashions". Axiom 6 may, however, appear
less strange if one bears in mind that it arose as a formulation of the
idealized conception that physicists can determine masses by collision
experiments.
In favor of the formulation given, an analogous situation from geometry
may also be cited. One geometric axiom states that for any two distinct
points there is a line joining these two points. Like the axioms of
mechanics, the geometric axioms originally express physical facts. If one
thinks of lines as realized, say, by taut ropes, then the axiom in question
is meant to express that between any two points of space there is such a
rope connection. This is completely analogous to the requirement that any
two mass points undergo an inelastic collision in the course of time.
The geometric example just considered also indicates, however, how one can
arrive at a plausible weakening of the axioms in question. One could say,
namely, that it is possible to join any two points of space by a taut
rope. Strictly speaking, the axiom would then be only a statement of
possibility. Correspondingly, one could weaken the mechanical Axiom 6 so
as to require only that it is possible for any two mass points to collide
inelastically at some time.
Another way of avoiding so strong a formulation as Axiom 6 is to trace the
concept of mass back to kinematic concepts by means of a reduction (in
Carnap's sense).
APPENDIX: Corrigenda to [2]:
p. 10, line 12 from below, read: ,,((0000)*)". p. 10, line 5 from below, read: „/(*)
p. 13, insert: ,,A4.3': ZZeBzs". p. 13, A4.5, read: „/(/ elo A
p. 28, line 8 from above, for ,,vergleichbaren" read ,,genidentischen". p. 31,
line 6 from above, for ,,/s0/" read ,,/s".
Bibliography
[1] ADAMS, E. W., The foundations of rigid body mechanics and the derivation of its
laws from those of particle mechanics. This volume, pp. 250-265.
[2] HERMES, H., Eine Axiomatisierung der allgemeinen Mechanik. Forschungen
zur Logik und zur Grundlegung der exakten Wissenschaften. Neue Folge, Heft
3, Leipzig 1938, 48 pp.
[3] McKINSEY, J. C. C., SUGAR, A. C. and SUPPES, P., Axiomatic Foundations of
Classical Particle Mechanics. Journal of Rational Mechanics and Analysis, vol.
2 (1953), pp. 253-272.
[4] ROSSER, B., Review of HERMES [2]. Journal of Symbolic Logic, vol. 3 (1938),
pp. 119-120.
Symposium on the Axiomatic Method
AXIOMS FOR RELATIVISTIC KINEMATICS
WITH OR WITHOUT PARITY
PATRICK SUPPES
Stanford University, Stanford, California, U.S.A.
1. Introduction. The primary aim of this paper is to give an elementary
derivation of the Lorentz transformations, without any assumptions of
continuity or linearity, from a single axiom concerning invariance of the
relativistic distance between any two space-time points connected by an
inertial path. The concluding section considers extensions of the theory of
relativistic kinematics which will destroy conservation of temporal parity,
that is, extensions which are not invariant under time reversals.
It is philosophically and empirically interesting that the Lorentz
transformations can be derived without any extraneous assumptions of
continuity or differentiability. In a word, the single assumption needed
for relativistic kinematics is that all observers at rest in inertial frames
get identical measurements of relativistic distances along inertial paths
when their measuring instruments have identical calibrations. Note that
it is a consequence and not an assumption that these observers are moving
with a uniform velocity with respect to each other. Granted the possi-
bility of perfect measurements everywhere of relativistic intervals, this
single axiom isolates in a precise way the narrow operational basis needed
for the special theory of relativity.
Prior to any search of the literature it would seem that this result
would be well-known, but I have not succeeded in finding the proof
anywhere. Every physics textbook on relativity makes a linearity
assumption at the minimum. In geometrical discussions of indefinite
quadratic forms it is often remarked that the relativistic interval is
invariant under the Lorentz group, but it is not proved that it is invariant
under no wider group, which is the main fact established here. Some
further remarks in this connection are made at the end of Section 2.
2. Primitive Notions and Single Axiom. Our single initial axiom for
relativistic kinematics is based on three primitive notions, each of which
has a simple physical interpretation. The first notion is an arbitrary set X
interpreted as the set of physical space-time points. The second notion is a
non-empty family $\mathfrak{F}$ of one-one functions mapping X onto $R_4$,
the set of all ordered quadruples of real numbers. (Thus X must have the
power of the continuum.) Intuitively each function in $\mathfrak{F}$
represents an inertial space-time frame of reference, or, more explicitly, a
space-time measuring apparatus at rest in an inertial frame. If $x \in X$,
$f \in \mathfrak{F}$, and $f(x) = \langle x_1, x_2, x_3, t \rangle$ then
$x_1$, $x_2$, and $x_3$ are the three orthogonal spatial coordinates of the
point x, and t the time coordinate, with respect to the frame f. For a more
explicit formal notation, $f_i(x)$ is the i-th coordinate of the space-time
point x with respect to the frame f, for i = 1, ..., 4. The third primitive
notion is a positive number c, which is to be interpreted as the speed of
light.
It is convenient to have a notation for the relativistic distance with
respect to a frame f between any two space-time points x and y.
DEFINITION 1. If $x, y \in X$ and $f \in \mathfrak{F}$ then
$$I_f(xy) = \sqrt{\sum_{i=1}^{3} [f_i(x) - f_i(y)]^2 - c^2 [f_4(x) - f_4(y)]^2}.$$
(We always take the square-root with positive sign.) If f is an inertial
frame, then (i) $I_f(xy) = 0$ if x and y are connected by a light line, (ii)
$I_f^2(xy) < 0$ if x and y lie on an inertial path (the square is negative
since $I_f(xy)$ is imaginary); (iii) $I_f^2(xy) > 0$ if x and y are separated
by a "space-like" interval. We use (ii) for a formal definition.
DEFINITION 2. If $x, y \in X$ and $f \in \mathfrak{F}$ then x AND y LIE ON AN
INERTIAL PATH WITH RESPECT TO f if and only if $I_f^2(xy) < 0$.
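As an illustration of Definitions 1 and 2, the following minimal sketch
computes the squared relativistic distance in one fixed frame and classifies
a pair of points by its sign. The function names, the tolerance and the
numerical value of c are assumptions made for the example and are not part
of the text.

```python
# Sketch of Definitions 1 and 2: the sign of the squared interval I_f^2
# decides whether two points lie on an inertial path, on a light line,
# or are separated by a space-like interval.
# Points are given by their coordinates (x1, x2, x3, t) in one frame f.

def interval_squared(p, q, c):
    """Squared relativistic distance I_f^2(xy) from Definition 1."""
    spatial = sum((p[i] - q[i]) ** 2 for i in range(3))
    return spatial - c ** 2 * (p[3] - q[3]) ** 2


def classify(p, q, c, tol=1e-12):
    """Classify a pair of distinct points by the sign of I_f^2."""
    s = interval_squared(p, q, c)
    if abs(s) <= tol:
        return "light line"
    return "inertial path" if s < 0 else "space-like"


c = 2.0  # hypothetical speed of light in arbitrary units
origin = (0.0, 0.0, 0.0, 0.0)
for point in [(1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0), (3.0, 0.0, 0.0, 1.0)]:
    print(point, classify(origin, point, c))
```

The three sample points are reached from the origin with speeds 1, 2 and 3,
so with c = 2 they fall into the three classes in turn.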
It will also occasionally be useful to characterize inertial paths in terms
of their speed. We may do this informally as follows. By the slope of a line
$\alpha$ in $R_4$, whose projection on the 4th coordinate (the time
coordinate) is a non-degenerate segment, we mean the three-dimensional
vector W such that for any two distinct points $\langle Z_1, t_1 \rangle$
and $\langle Z_2, t_2 \rangle$ of $\alpha$
$$W = \frac{Z_1 - Z_2}{t_1 - t_2}.$$
By the speed of $\alpha$ we mean the non-negative number $|W|$. An inertial
path is a line in $R_4$ whose speed is less than c; and a light line is of
course a line whose speed is c.
The single axiom we require is embodied in the following definition.
DEFINITION 3. A system $\langle X, \mathfrak{F}, c \rangle$ is a COLLECTION
OF RELATIVISTIC FRAMES if and only if for every x, y in X, whenever x and y
lie on an inertial path with respect to some frame in $\mathfrak{F}$, then
for all f, f' in $\mathfrak{F}$
(1) $I_f(xy) = I_{f'}(xy)$.
I originally formulated this invariance axiom so as to require that equation
(1) hold for all space-time points x and y, that is, without restricting
them to lie on an inertial path (with respect to some frame in
$\mathfrak{F}$). Walter Noll pointed out to me that with this stronger axiom
no physically motivated arguments of the kind given below are required to
prove that any two frames in $\mathfrak{F}$ are related by a linear
transformation; a relatively simple algebraic argument may be given to show
this.
On the other hand, when the invariance assumption is restricted, as
it is here, to distances between points on inertial paths, the line of
argument formalized in the theorems of the next section seems necessary.
This restriction to pairs of points on inertial paths is physically natural
because their distances $I_f(xy)$ are more susceptible to direct
measurements than are the distances of points separated by a space-like
interval (i.e., $I_f^2(xy) > 0$).
3. Theorems. In proving the main result that any two frames in
$\mathfrak{F}$ are related by a Lorentz transformation, some preliminary
definitions, theorems and lemmas will be useful. We shall use freely the
geometrical language appropriate to Euclidean four-dimensional space with
the ordinary positive definite quadratic form.
THEOREM 1. If $k \geq 0$ and $f(x) - f(y) = k[f(u) - f(v)]$ then
$I_f(xy) = k I_f(uv)$.
PROOF: If k = 0, the theorem is immediate. So we need to consider
the case for which k > 0. It follows from the hypothesis of the theorem
that
(1) $x_i - y_i = k(u_i - v_i)$ for i = 1, ..., 4,
where, for brevity here and subsequently, when we are considering a
fixed element f of $\mathfrak{F}$, $f_i(x) = x_i$, etc. Using (1) and
Definition 1 we then have:
$$I_f(xy) = \sqrt{\sum_{i=1}^{3} (x_i - y_i)^2 - c^2 (x_4 - y_4)^2}
= \sqrt{\sum_{i=1}^{3} k^2 (u_i - v_i)^2 - c^2 k^2 (u_4 - v_4)^2}
= k I_f(uv).$$ Q.E.D.
In the next theorem we use the notion of betweenness in a way which is
meant not to exclude identity with one of the end points.
THEOREM 2. If the points f(x), f(y) and f(z) are collinear and f(y) is
between f(x) and f(z) then
$$I_f(xy) + I_f(yz) = I_f(xz).$$
PROOF: Extending our subscript notation, let f(x) = x, etc. Since the
three points x, y and z are collinear, and y is between x and z, there is a
number k such that $0 \leq k \leq 1$ and
(1) $y = kx + (1 - k)z$,
whence
$y - z = k(x - z)$,
and thus by Theorem 1
(2) $I_f(yz) = k I_f(xz)$.
By adding and subtracting x from the right-hand side of (1), we get:
$y = kx + (1 - k)z + x - x$,
whence
$x - y = (1 - k)(x - z)$,
and thus by virtue of Theorem 1 again,
(3) $I_f(xy) = (1 - k) I_f(xz)$.
Adding (2) and (3) we obtain the desired result:
$I_f(xy) + I_f(yz) = I_f(xz)$. Q.E.D.
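Theorems 1 and 2 can be checked numerically for a particular collinear
triple. The sketch below is only an illustration under assumed sample
coordinates and an assumed value of c; it uses the complex square root so
that the interval of a time-like pair is imaginary with positive sign, as
required by Definition 1.

```python
import cmath


def interval(p, q, c):
    """I_f(xy) of Definition 1; imaginary (positive sign) when time-like."""
    s = sum((p[i] - q[i]) ** 2 for i in range(3)) - c ** 2 * (p[3] - q[3]) ** 2
    return cmath.sqrt(s)


def point_between(x, z, k):
    """The point y = k*x + (1 - k)*z used in the proof, for 0 <= k <= 1."""
    return tuple(k * a + (1 - k) * b for a, b in zip(x, z))


c = 1.0                        # hypothetical units
x = (0.0, 0.0, 0.0, 0.0)
z = (1.0, 2.0, 0.5, 4.0)       # time-like separation from x
y = point_between(x, z, 0.3)   # y - z = 0.3 * (x - z)

# Theorem 1: I_f(yz) = k * I_f(xz) with k = 0.3.
scaling_gap = abs(interval(y, z, c) - 0.3 * interval(x, z, c))
# Theorem 2: I_f(xy) + I_f(yz) = I_f(xz).
additive_gap = abs(interval(x, y, c) + interval(y, z, c) - interval(x, z, c))
print(scaling_gap < 1e-12, additive_gap < 1e-12)
```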
Our next objective is to prove a partial converse of Theorem 2. Since
the notion of Lorentz transformation is needed in the proof, we introduce
the appropriate formal definitions at this point. $\mathcal{I}$ is the
identity matrix of the necessary order.
DEFINITION 4. A matrix $\mathcal{A}$ (of order 4) is a LORENTZ MATRIX if and
only if there exist real numbers $\beta$, $\delta$, a three-dimensional
vector U, and an orthogonal matrix $\mathcal{E}$ of order 3 such that
$$\beta^2 \left(1 - \frac{U^2}{c^2}\right) = 1, \qquad \delta^2 = 1,$$
$$\mathcal{A} = \begin{pmatrix} \mathcal{E} & 0 \\ 0 & \delta \end{pmatrix}
\begin{pmatrix} \mathcal{I} + \frac{\beta - 1}{U^2} U^* U & -\frac{\beta U^*}{c^2} \\ -\beta U & \beta \end{pmatrix}.$$
(In this definition and elsewhere, if A is a matrix, A* is its transpose, and
vectors like U are one-rowed matrices; thus U* is a one-column matrix.)
The physical interpretation of the various quantities in Definition 4
should be obvious. The number $\beta$ is the Lorentz contraction factor.
When $\delta = -1$, we have a reversal of the direction of time. The matrix
$\mathcal{E}$ represents a rotation of the spatial coordinates, or a
rotation followed by a reflection. The vector U is the relative velocity of
the two frames of reference. For future reference it may be noted that every
Lorentz matrix is non-singular.
DEFINITION 5. A Lorentz transformation is a one-one function $\varphi$
mapping $R_4$ onto itself such that there is a Lorentz matrix $\mathcal{A}$
and a 4-dimensional vector B so that for all Z in $R_4$
$$\varphi(Z) = Z\mathcal{A} + B.$$
The physical interpretation of the vector B is clear. Its first three
coordinates represent a translation of the origin of the spatial
coordinates, and its last coordinate a translation of the time origin.
Definition 5 makes it clear that every Lorentz transformation is a
nonsingular affine transformation of $R_4$, a fact which we shall use in
several contexts. The important consideration for the proof of Theorem 3 is
that affine transformations preserve the collinearity of points.
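A small numerical sketch may make Definitions 4 and 5 concrete. It assumes
the usual boost form of the Lorentz matrix written for row vectors, with
the orthogonal factor taken as the identity, the time-orientation factor
equal to 1, and a non-zero relative velocity U; the function names and
sample values are hypothetical. The check at the end confirms that the
squared relativistic distance of Definition 1 is unchanged by the
transformation.

```python
import math


def boost_matrix(U, c):
    """Lorentz matrix for relative velocity U (non-zero, |U| < c).

    Assumed form: orthogonal factor = identity, time orientation = +1,
    acting on row vectors Z = (z1, z2, z3, t).
    """
    u2 = sum(u * u for u in U)
    beta = 1.0 / math.sqrt(1.0 - u2 / c ** 2)
    A = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            A[i][j] = (1.0 if i == j else 0.0) + (beta - 1.0) * U[i] * U[j] / u2
        A[i][3] = -beta * U[i] / c ** 2   # column giving the new time
        A[3][i] = -beta * U[i]            # row multiplying the old time
    A[3][3] = beta
    return A


def lorentz_transform(Z, A, B):
    """phi(Z) = Z A + B for a row vector Z."""
    return tuple(sum(Z[i] * A[i][j] for i in range(4)) + B[j] for j in range(4))


def interval_squared(p, q, c):
    return sum((p[i] - q[i]) ** 2 for i in range(3)) - c ** 2 * (p[3] - q[3]) ** 2


c = 3.0
A = boost_matrix((1.0, 2.0, 0.0), c)
B = (10.0, -3.0, 0.5, 7.0)               # translation of space and time origins
p, q = (1.0, 2.0, 3.0, 4.0), (0.5, -1.0, 2.0, 1.0)
before = interval_squared(p, q, c)
after = interval_squared(lorentz_transform(p, A, B), lorentz_transform(q, A, B), c)
print(abs(before - after) < 1e-9)
```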
THEOREM 3. If any two of the three points x, y, z are distinct and lie
on an inertial path with respect to f and if
$I_f(xy) + I_f(yz) = I_f(xz)$, then the points f(x), f(y) and f(z) are
collinear, and f(y) is between f(x) and f(z).
PROOF: Three cases naturally arise.
Case 1. $I_f^2(xy) < 0$. In this case the line segment f(x) - f(y) is an
inertial path segment from x to y, and there exists a Lorentz
transformation $\varphi$ which will transform the segment f(x) - f(y) to
"rest", that is, more precisely, $\varphi$ may be chosen so as to transform
f to a frame f', which need not be a member of $\mathfrak{F}$, such that the
spatial coordinates of x and y are at the origin, the time coordinate of x
is zero, and z has but one spatial coordinate, by appropriate spatial
rotation. That is, we have:
$f'(x) = \langle 0, 0, 0, 0 \rangle$,
$f'(y) = \langle 0, 0, 0, y_4' \rangle$,
$f'(z) = \langle z_1', 0, 0, z_4' \rangle$.
We shall prove that f'(x), f'(y) and f'(z) are collinear. Since $\varphi$
is non-singular and affine, its inverse $\varphi^{-1}$ exists and is affine,
whence collinearity is preserved in transforming from f' back to f.
It is a familiar fact that the relativistic intervals $I_f(xy)$, $I_f(yz)$
and $I_f(xz)$ are Lorentz invariant and thus have the same value with
respect to f' as f. Consequently, from the additive hypothesis of the
theorem, we have:
(1) $\sqrt{-c^2 (y_4')^2} + \sqrt{(z_1')^2 - c^2 (y_4' - z_4')^2} = \sqrt{(z_1')^2 - c^2 (z_4')^2}$.
Squaring both sides of (1), then cancelling and rearranging terms, we
obtain:
(2) $\sqrt{-(y_4')^2} \cdot \sqrt{(z_1')^2 - c^2 (y_4' - z_4')^2} = c\, y_4' (y_4' - z_4')$.
If $y_4' = 0$, then x and y are identical, contrary to the hypothesis that
$I_f^2(xy) < 0$. Taking then $y_4' \neq 0$, dividing it out in (2), squaring
both sides and cancelling, we infer:
$(z_1')^2 = 0$,
whence
$z_1' = 0$,
which establishes the collinearity in f' of the three points, since their
spatial coordinates coincide, and obviously f'(y) is between f'(x) and f'(z).
Case 2. $I_f^2(yz) < 0$. Proof similar to Case 1.
Case 3. $I_f^2(xz) < 0$. By an argument similar to that given for Case 1,
we may go from f to a frame f' by a Lorentz transformation which will
transform the inertial segment f(x) - f(z) to "rest." That is, we obtain:
$f'(x) = \langle 0, 0, 0, 0 \rangle$,
$f'(y) = \langle y_1', 0, 0, y_4' \rangle$,
$f'(z) = \langle 0, 0, 0, z_4' \rangle$.
Then by the additive hypothesis of the theorem:
(3) $\sqrt{(y_1')^2 - c^2 (y_4')^2} + \sqrt{(y_1')^2 - c^2 (y_4' - z_4')^2} = \sqrt{-c^2 (z_4')^2}$.
Proceeding as before, by squaring and cancelling, we obtain from (3):
(4) $\sqrt{(y_1')^2 - c^2 (y_4')^2} \cdot \sqrt{(y_1')^2 - c^2 (y_4' - z_4')^2} = c^2 y_4' (y_4' - z_4') - (y_1')^2$.
Squaring again and cancelling yields:
(5) $(y_1')^2 (z_4')^2 = 0$.
There are now two possibilities to consider: either $y_1' = 0$ or
$z_4' = 0$. If the former is the case, then the three points are collinear
in $R_4$, for they are all three placed at the origin of the spatial
coordinates. On the other hand, if $z_4' = 0$, then x and z are identical
points, contrary to hypothesis. Again it is obvious that f'(y) is between
f'(x) and f'(z). Q.E.D.
That a full converse of Theorem 2 cannot be proved, in other words
that the additive hypothesis
$$I_f(xy) + I_f(yz) = I_f(xz)$$
does not imply collinearity, is shown by the following counterexample:
$f(x) = \langle 0, 0, 0, 0 \rangle$,
$f(y) = \langle 1, 1, 0, 0 \rangle$,
$f(z) = \langle \sqrt{2}\,c, 0, 0, 1 \rangle$.
Clearly, f(x), f(y) and f(z) are not collinear in $R_4$, but
$I_f(xy) + I_f(yz) = I_f(xz)$, that is,
(1) $\sqrt{2} + \sqrt{(1 - \sqrt{2}\,c)^2 + 1 - c^2} = c$.
For, simplifying and rearranging (1), we see it is equivalent to:
(2) $\sqrt{2 - 2\sqrt{2}\,c + c^2} = c - \sqrt{2}$
and the left-hand side of (2) is simply
$\sqrt{(c - \sqrt{2})^2} = c - \sqrt{2}$.
(It may be mentioned that the full converse of Theorem 2 does hold for
$R_2$, that is, when there is a restriction to one spatial dimension.)
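The counterexample can be verified numerically. In the sketch below the
coordinates of f(z) are taken to be <sqrt(2) c, 0, 0, 1>, which is an
assumption chosen to agree with equations (1) and (2), and c is given a
hypothetical value larger than sqrt(2) so that c - sqrt(2) is positive. All
three pairs are space-like here, so the ordinary real square root suffices.

```python
import math

c = 3.0  # hypothetical value; the check needs c > sqrt(2)

fx = (0.0, 0.0, 0.0, 0.0)
fy = (1.0, 1.0, 0.0, 0.0)
fz = (math.sqrt(2) * c, 0.0, 0.0, 1.0)  # assumed coordinates of f(z)


def interval(p, q):
    """I_f for a space-like pair (squared interval is non-negative)."""
    s = sum((p[i] - q[i]) ** 2 for i in range(3)) - c ** 2 * (p[3] - q[3]) ** 2
    return math.sqrt(s)


def collinear_with_origin(p, q, tol=1e-12):
    """True if the origin, p and q lie on one line (all 2x2 minors vanish)."""
    return all(
        abs(p[i] * q[j] - p[j] * q[i]) <= tol
        for i in range(4)
        for j in range(i + 1, 4)
    )


additive = math.isclose(interval(fx, fy) + interval(fy, fz), interval(fx, fz))
print(additive, collinear_with_origin(fy, fz))
```

The additive relation holds although the three points are not collinear,
which is what blocks a full converse of Theorem 2.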
We now want to prove some theorems about properties which are
invariant in $\mathfrak{F}$. Formally, a property is invariant in
$\mathfrak{F}$ if and only if it holds or does not hold uniformly for every
member f of $\mathfrak{F}$. Thus to say that the property of a line being an
inertial path is invariant in $\mathfrak{F}$ means that a line with respect
to f in $\mathfrak{F}$ is an inertial path with respect to f if and only if
it is an inertial path with respect to every f' in $\mathfrak{F}$. All
geometric objects referred to here are with respect to the frames in
$\mathfrak{F}$.
THEOREM 4. The property of being the midpoint of a finite segment of an
inertial path is invariant in $\mathfrak{F}$.
PROOF: Suppose x, y and z lie on an inertial path with respect to f and
(1) $f(y) = \tfrac{1}{2} f(x) + \tfrac{1}{2} f(z)$,
and thus
$f(y) - f(x) = \tfrac{1}{2}[f(z) - f(x)]$.
Consequently by virtue of Theorem 1
(2) $I_f(xy) = \tfrac{1}{2} I_f(xz)$,
and similarly
(3) $I_f(yz) = \tfrac{1}{2} I_f(xz)$,
whence
(4) $I_f(xy) + I_f(yz) = I_f(xz)$.
Now by the invariance axiom of Definition 3, for any f' in $\mathfrak{F}$
$I_{f'}(xy) = I_f(xy)$,
$I_{f'}(yz) = I_f(yz)$,
$I_{f'}(xz) = I_f(xz)$.
Substituting these identities in (4) we obtain:
$I_{f'}(xy) + I_{f'}(yz) = I_{f'}(xz)$.
Thus by virtue of Theorem 3, f'(x), f'(y) and f'(z) are collinear with f'(y)
between f'(x) and f'(z). Moreover, since by the invariance axiom (2) and
(3) hold for f', we conclude f'(y) is actually the midpoint. Q.E.D.
This proof is easily extended to show that the property of being an
inertial path is invariant in $\mathfrak{F}$, but we do not directly need
this fact.
We next want to show that this midpoint property is invariant for
arbitrary segments. In view of the counterexample following Theorem 3
it is evident that a direct proof in terms of the relativistic intervals
cannot be given. The method we shall use consists essentially of
constructing a parallelogram whose sides are segments of inertial paths. A
similar but somewhat more complicated proof is given in Rubin and
Suppes [3].
THEOREM 5. The property of being the midpoint of an arbitrary finite
segment is invariant in $\mathfrak{F}$.
PROOF: Let $A = \langle Z_1, t_1 \rangle$ and $B = \langle Z_2, t_2 \rangle$
where AB is an arbitrary segment in $R_4$. (The points A to G defined here
are with respect to f in $\mathfrak{F}$.) For definiteness assume
$t_1 \geq t_2$. We set
$$Z_0 = \tfrac{1}{2}(Z_1 + Z_2),$$
and we choose $t_0$ and $t_3$ so that
$$t_0 < t_2 - \frac{|Z_1 - Z_2|}{2c},$$
$$t_3 > t_1 + \frac{|Z_1 - Z_2|}{2c}.$$
We now let (see Figure 1)
$C = \langle Z_0, t_0 \rangle$, $D = \langle Z_0, t_3 \rangle$,
$E = \frac{A + B}{2}$, $F = \frac{B + D}{2}$, $G = \frac{A + C}{2}$.
Fig. 1
Denoting now the same points with respect to f' in $\mathfrak{F}$ by primes,
we have by virtue of this construction in f and the invariance property of
Theorem 4,
(1) $E' = \tfrac{1}{2}(C' + D')$,
(2) $F' = \tfrac{1}{2}(B' + D')$,
(3) $G' = \tfrac{1}{2}(A' + C')$,
(4) $E' = \tfrac{1}{2}(F' + G')$.
Substituting (2) and (3) into (4) we have:
$E' = \tfrac{1}{4}(A' + B' + C' + D')$.
Now substituting (1) into the right-hand side of the last equation and
simplifying, we infer the desired result:
$E' = \tfrac{1}{2}(A' + B')$,
since by construction $E = \tfrac{1}{2}(A + B)$.
Thus the midpoint of an arbitrary segment is preserved. Q.E.D.
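The parallelogram construction of this proof can be tried out on sample
coordinates. The sketch below is an illustration only: it places C and D at
the spatial midpoint of A and B and, as an additional assumption not stated
above, uses the same time margin below B and above A, so that E is also the
midpoint of CD. All names and numbers are hypothetical.

```python
import math


def midpoint(P, Q):
    return tuple((a + b) / 2 for a, b in zip(P, Q))


def is_inertial(P, Q, c):
    """True if the segment PQ has speed less than c."""
    return math.dist(P[:3], Q[:3]) < c * abs(P[3] - Q[3])


c = 1.0
A = (4.0, 0.0, 0.0, 1.0)   # <Z1, t1>
B = (0.0, 0.0, 0.0, 0.0)   # <Z2, t2>, with t1 >= t2; AB is space-like here

d = math.dist(A[:3], B[:3])
margin = d / (2 * c) + 1.0          # assumed: any margin larger than d / (2c)
Z0 = midpoint(A, B)[:3]
C = Z0 + (B[3] - margin,)           # t0 = t2 - margin
D = Z0 + (A[3] + margin,)           # t3 = t1 + margin

E, F, G = midpoint(A, B), midpoint(B, D), midpoint(A, C)

# E is reached through midpoints of inertial segments only.
print(E == midpoint(C, D), E == midpoint(F, G))
print([is_inertial(P, Q, c) for P, Q in [(A, C), (B, D), (C, D), (F, G)]])
print(is_inertial(A, B, c))
```

Every midpoint used in the chain lies on an inertial segment, while AB
itself need not be inertial, which is the point of the construction.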
THEOREM 6. The property of two finite segments of inertial paths being
parallel and in a fixed ratio is invariant in $\mathfrak{F}$.
PROOF: Let $f(x) - f(y) = k[f(u) - f(v)]$, with f(x) - f(y) and
f(u) - f(v) segments of inertial paths. Without loss of generality we may
assume $k \geq 1$. Let z be the point such that
$f(x) - f(y) = k[f(x) - f(z)]$. We now construct a parallelogram with
f(u) - f(v) and f(x) - f(z) as two parallel sides. By the previous theorem
any parallelogram in f is carried into a parallelogram in f' since the
midpoint of the diagonals is preserved. Thus
(1) $f'(x) - f'(z) = f'(u) - f'(v)$,
but by Theorems 2 and 3
(2) $f'(x) - f'(y) = k[f'(x) - f'(z)]$,
(for details see proof of Theorem 4), whence from (1) and (2)
$f'(x) - f'(y) = k[f'(u) - f'(v)]$. Q.E.D.
As the final theorem about properties invariant in $\mathfrak{F}$, we want
to generalize the preceding theorem to arbitrary finite segments.
THEOREM 7. The property of two arbitrary finite segments being parallel
and in a fixed ratio is invariant in $\mathfrak{F}$.
PROOF: In view of preceding theorems, the crucial thing to show is
that if
$f(x) - f(y) = k[f(x) - f(z)]$
then
$f'(x) - f'(y) = k[f'(x) - f'(z)]$.
Our approach is to use an "inertial" parallelogram similar to the one used
in the proof of Theorem 5. In fact an exactly similar construction will be
used; points A to E are constructed identically, where A = f(x) and
B = f(y). Without loss of generality we may assume $k \geq 2$, that is,
that f(z) = F is between A and E. We then have that
(1) $A - E = \frac{k}{2}[A - F]$.
We draw through F a line parallel to CD, which cuts AC at G and AD
at H. (See Figure 2.)
Now (1) is equivalent to:
(2) $F = \frac{2}{k} E + \left(1 - \frac{2}{k}\right) A$.
Moreover, by construction
(3) $F = \tfrac{1}{2}(G + H)$
(4) $E = \tfrac{1}{2}(C + D)$
(5) $G = \frac{2}{k} C + \left(1 - \frac{2}{k}\right) A$
(6) $H = \frac{2}{k} D + \left(1 - \frac{2}{k}\right) A$.
Since GFH, AGC, AHD and CED are by construction segments of inertial
paths, by virtue of Theorem 6, we have from (3)-(6):
(7) $F' = \tfrac{1}{2}(G' + H')$
(8) $E' = \tfrac{1}{2}(C' + D')$
(9) $G' = \frac{2}{k} C' + \left(1 - \frac{2}{k}\right) A'$
(10) $H' = \frac{2}{k} D' + \left(1 - \frac{2}{k}\right) A'$.
Substituting (9) and (10) in (7), we get:
(11) $F' = \frac{1}{k}(C' + D') + \left(1 - \frac{2}{k}\right) A'$.
And now substituting (8) in (11), we obtain the desired result:
(12) $F' = \frac{2}{k} E' + \left(1 - \frac{2}{k}\right) A'$.
But now by virtue of Theorem 5
$E' = \tfrac{1}{2}(A' + B')$,
which together with (12) yields:
$F' = \frac{1}{k} B' + \left(1 - \frac{1}{k}\right) A'$,
which is equivalent to:
(13) $f'(x) - f'(y) = k[f'(x) - f'(z)]$.
The remainder of the proof, based upon considering
$f(x) - f(y) = k[f(u) - f(v)]$, is exactly like that of Theorem 6 and may be
omitted. (In place of Theorems 2 and 3 in that proof we use the result just
established.) Q.E.D.
We now state the theorem toward which the preceding seven have been
directed.
THEOREM 8. Any two frames in $\mathfrak{F}$ are related by a non-singular
affine transformation.
PROOF: A familiar necessary and sufficient condition that a
transformation of a vector space be affine is that parallel finite segments
with a fixed ratio be carried into parallel segments with the same fixed
ratio. (See, e.g. Birkhoff and MacLane [1, p. 263].) Hence by virtue of
Theorem 7 any two frames are related by an affine transformation.
Non-singularity of the transformation follows from the fact that each frame
in $\mathfrak{F}$ is a one-one mapping of X onto $R_4$. Q.E.D.
Once we have any two frames in $\mathfrak{F}$ related by an affine
transformation, it is not difficult to proceed to show that they are related
by a Lorentz transformation. In the proof of this latter fact, it is
convenient to use a Lemma about Lorentz matrices, which is proved in Rubin
and Suppes [3], and is simply a matter of direct computation.
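LEMMA 1. A matrix $\mathcal{A}$ (of order 4) is a Lorentz matrix if and only if
$$\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix}.$$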
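The criterion of Lemma 1 lends itself to a direct numerical test. The
sketch below reads the lemma as the condition that A G A* = G with
G = diag(1, 1, 1, -c^2), which is the form used in step (16) of the proof
of Theorem 9, and builds a sample Lorentz matrix as a spatial rotation with
optional time reversal followed by a boost. The helper names, the choice of
rotation axis and the sample values are assumptions made for the example.

```python
import math


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def lorentz_matrix(U, c, angle=0.0, delta=1.0):
    """Rotation about the third axis and time factor delta, then a boost by U != 0."""
    u2 = sum(u * u for u in U)
    beta = 1.0 / math.sqrt(1.0 - u2 / c ** 2)
    boost = [[(1.0 if i == j else 0.0) + (beta - 1.0) * U[i] * U[j] / u2 for j in range(3)]
             + [-beta * U[i] / c ** 2] for i in range(3)]
    boost.append([-beta * u for u in U] + [beta])
    ca, sa = math.cos(angle), math.sin(angle)
    outer = [[ca, sa, 0.0, 0.0], [-sa, ca, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, delta]]
    return matmul(outer, boost)


def satisfies_lemma(A, c, tol=1e-9):
    """Check A G A* = G for G = diag(1, 1, 1, -c^2)."""
    G = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, -c ** 2]]
    P = matmul(matmul(A, G), transpose(A))
    return all(abs(P[i][j] - G[i][j]) <= tol for i in range(4) for j in range(4))


c = 3.0
A = lorentz_matrix((1.0, 2.0, 0.0), c, angle=0.7, delta=-1.0)
# A Galilean-style shear is affine and non-singular but fails the criterion.
shear = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [-1.0, 0, 0, 1.0]]
print(satisfies_lemma(A, c), satisfies_lemma(shear, c))
```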
We now prove the basic result:
THEOREM 9. Any two frames in $\mathfrak{F}$ are related by a Lorentz
transformation.
PROOF: Let f, f' be two frames in $\mathfrak{F}$. As before, for x in X,
f(x) = x, $f_1(x) = x_1$, $f'(x) = x'$, etc. We consider the transformation
$\varphi$ such that for every x in X, $\varphi(x) = x'$. By virtue of
Theorem 8 there is a non-singular matrix $\mathcal{A}$ (of order 4) and a
four-dimensional vector B such that for every x in X
$$\varphi(x) = x\mathcal{A} + B.$$
The proof reduces to showing that $\mathcal{A}$ is a Lorentz matrix.
Let
(1) $\mathcal{A} = \begin{pmatrix} \mathcal{D} & E^* \\ F & g \end{pmatrix}$.
And let $\alpha$ be a light line (in f) such that for any two distinct
points x and y of $\alpha$ if $x = \langle Z_1, t_1 \rangle$ and
$y = \langle Z_2, t_2 \rangle$, then
(2) $W = \frac{Z_1 - Z_2}{t_1 - t_2}$.
Clearly $|W| = c$. Now let
(3) $W' = \frac{Z_1' - Z_2'}{t_1' - t_2'}$.
From (1), (2) and (3) we have:
(4) $W' = \frac{(Z_1 - Z_2)\mathcal{D} + (t_1 - t_2)F}{(Z_1 - Z_2)E^* + (t_1 - t_2)g}$.
Dividing all terms on the right of (4) by $t_1 - t_2$, and using (2), we
obtain:
(5) $W' = \frac{W\mathcal{D} + F}{WE^* + g}$.
At this point in the argument we need to know that $|W'| = c$, that is
to say, we need to know that if $I_f(xy) = 0$, then $I_{f'}(xy) = 0$. The
proof of this fact is not difficult. From our fundamental invariance axiom
we have that $I_{f'}^2(xy) \geq 0$, that is,
(6) $|W'| \geq c$.
Consider now a sequence of inertial lines $\alpha_1, \alpha_2, \ldots$ whose
slopes $W_1, W_2, \ldots$ are such that
(7) $\lim_{n \to \infty} W_n = W$.
Now corresponding to (5) we have:
(8) $|W_n'| = \frac{|W_n \mathcal{D} + F|}{|W_n E^* + g|} < c$.
Whence, from (8) we conclude that if $WE^* + g \neq 0$, then
(9) $|W'| = |\lim_{n \to \infty} W_n'| \leq c$.
Thus from (6) and (9) we infer
(10) $|W'| = c$,
if $WE^* + g \neq 0$, but that this is so is easily seen. For, suppose not.
Then
$\lim_{n \to \infty} (W_n E^* + g) = 0$,
and thus
$\lim_{n \to \infty} (W_n \mathcal{D} + F) = 0$.
Consequently $W\mathcal{D} + F = 0$, and
$\langle W, 1 \rangle \mathcal{A} = 0$, which is absurd in view of the
non-singularity of $\mathcal{A}$.
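Since $|W'| = c$, we have by squaring (5):
(11) $c^2 = \frac{W\mathcal{D}\mathcal{D}^*W^* + 2W\mathcal{D}F^* + |F|^2}{WE^*EW^* + 2WE^*g + g^2}$,
and consequently
(12) $W(\mathcal{D}\mathcal{D}^* - c^2 E^*E)W^* + 2W(\mathcal{D}F^* - c^2 E^* g) + |F|^2 - c^2 g^2 = 0$.
Since (12) holds for an arbitrary light line, we may replace W by -W,
and obtain (12) again. We thus infer:
$W(\mathcal{D}F^* - c^2 E^* g) = 0$,
but the direction of W is arbitrary, whence
(13) $\mathcal{D}F^* - c^2 E^* g = 0$.
Now let $x = \langle 0, 0, 0, 0 \rangle$ and $y = \langle 0, 0, 0, 1 \rangle$.
Then
$I_f^2(xy) = -c^2$.
But it is easily seen from (1) that
$I_{f'}^2(xy) = |F|^2 - c^2 g^2$,
and thus by our fundamental invariance axiom
(14) $c^2 g^2 - |F|^2 = c^2$.
From (12), (13), (14) and the fact that $|W|^2 = c^2$, we infer:
$W(\mathcal{D}\mathcal{D}^* - c^2 E^*E)W^* = |W|^2$,
and because the direction of W is arbitrary we conclude:
(15) $\mathcal{D}\mathcal{D}^* - c^2 E^*E = \mathcal{I}$,
where $\mathcal{I}$ is the identity matrix.
Now by direct computation on the basis of (1),
(16) $\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{D}\mathcal{D}^* - c^2 E^*E & \mathcal{D}F^* - c^2 E^* g \\ (\mathcal{D}F^* - c^2 E^* g)^* & FF^* - c^2 g^2 \end{pmatrix}$.
From (13), (14), (15) and (16) we arrive finally at the result:
$$\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix},$$
and thus by virtue of Lemma 1, $\mathcal{A}$ is a Lorentz matrix. Q.E.D.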
4. Temporal Parity. Turning now to problems of parity, we may for
simplicity restrict the discussion to time reversals. Similar considerations
apply to spatial reflections.
A simple axiom, which will prevent time reversal between frames in
$\mathfrak{F}$, is:
(T1) There are elements x and y in X such that for all f in $\mathfrak{F}$
$f_4(x) < f_4(y)$.
There is, however, a simple objection to this axiom. It is unsatisfactory
to have time reversal depend on the existence of special space-time points,
which could possibly occur only in some remote region or epoch. This
objection is met by T2.
(T2) If $I_f^2(xy) < 0$ then either for all f in $\mathfrak{F}$
$f_4(x) < f_4(y)$
or for all f in $\mathfrak{F}$
$f_4(y) < f_4(x)$.
T2 replaces the postulation of special points by a general property: given
any segment of an inertial path, all frames in $\mathfrak{F}$ must orient
the direction of time for this segment in the same way.
Nevertheless, there is another objection to T1 which holds also for T2:
the appropriate axiom should be formulated so that a given observer in a
frame f may verify it without observing any other frames, that is, he may
decide if he is a qualified candidate for membership in $\mathfrak{F}$
without observing other members of $\mathfrak{F}$. (This issue is relevant
to the single axiom of Definition 3 but cannot be entered into here.) From a
logical standpoint this means eliminating quantification over elements of
$\mathfrak{F}$, which may be done by introducing a fourth primitive notion,
a binary relation $\sigma$ of signaling on X. To block time reversal we need
postulate but two properties of $\sigma$:
(T3.1) For every x in X there is a y in X such that $x \sigma y$.
(T3.2) If $x \sigma y$ then $f_4(x) < f_4(y)$.
However, a third objection to (T1) also applies to (T2) and (T3).
Namely, we are essentially postulating what we want to prove. The
axioms stated here correspond to postulating artificially in a theory of
measurement of mass that a certain object must be assigned the mass of
one. I pose the question: Is it possible to find "natural" axioms which fix a
direction of time? It may be mentioned that Robb's meticulous
axiomatization [2] in terms of the notion of after provides no answer.
References
[1] BIRKHOFF, G. and S. MACLANE, A Survey of Modern Algebra. New York 1941,
XI + 450 pp.
[2] ROBB, A. A., Geometry of Space and Time. Cambridge 1936, VII + 408 pp.
[3] RUBIN, H. and P. SUPPES, Transformations of systems of relativistic particle
mechanics. Pacific Journal of Mathematics, vol. 4 (1954), pp. 563-601.
Symposium on the Axiomatic Method
AXIOMS FOR COSMOLOGY
A. G. WALKER
University of Liverpool, Liverpool, England
1. In relativistic cosmology there is a generally accepted form of
space-time which serves as a geometrical model for the large scale features
of the universe. It is a four dimensional manifold with a quadratic
differential metric
$$dt^2 - R^2 d\sigma^2$$
where t is a preferential coordinate, R is a function of t only, and
$d\sigma^2$ is the metric of a three dimensional Riemannian space C of
constant curvature k. Topologically the space-time is a product
$T \times C$ where T is the continuum of real numbers (parametrised by t)
and C is a 3-space which may be spherical or elliptic (k > 0), hyperbolic
(k < 0) or euclidean (k = 0). Each point x of C represents a fundamental
particle (corresponding to a galaxy in the universe); the curve
$T \times x$, an orthogonal trajectory of the hypersurfaces t = constant in
space-time, is the world line of the particle, and the null geodesics of
space-time represent light paths. The natural projection of each such null
geodesic into C is a geodesic of C.
In dynamical theories, such as General Relativity, the form of space-time
is strictly invariant and the function R is significant in that, through the
field equations, it determines the distribution of matter in the universe.
In kinematical theories, however, space-time is only conformally invariant
with the result that R can be 'transformed away' by a regraduation of the
time scale, i.e. a transformation of the time parameter from t to $\tau$
where $d\tau = dt/R$. Ignoring a conformal factor, the metric of space-time
then becomes $d\tau^2 - d\sigma^2$ and the model is static. The
$\tau$-scale giving this comparatively simple model is unique except for an
arbitrary affine transformation $\tau' = a\tau + b$ (a > 0); with each
$\tau$-scale there is a 'natural' measure of distance in C, by
$\int d\sigma$, and the only effect of a regraduation $\tau' = a\tau + b$ is
a change of distance unit by a factor a. With these measurements of time and
distance light can be said to have unit speed in C.
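The regraduation of the time scale described above amounts to integrating
1/R. As a rough sketch, the following code integrates numerically for a
hypothetical choice R(t) = t, for which the regraduated time is log t up to
an additive constant; the function name, the midpoint rule and the sample
scale function are assumptions for illustration only.

```python
import math


def regraduate(R, t_start, t_end, steps=100_000):
    """Increase of the regraduated time between t_start and t_end.

    Integrates d(tau) = dt / R(t) with the midpoint rule.
    """
    h = (t_end - t_start) / steps
    return sum(h / R(t_start + (i + 0.5) * h) for i in range(steps))


# Hypothetical scale function R(t) = t: tau(t) = log t + constant.
tau = regraduate(lambda t: t, 1.0, math.e)
print(abs(tau - 1.0) < 1e-6)

# An affine change tau' = a * tau + b rescales every increment by a.
a = 2.5
tau_scaled = regraduate(lambda t: t / a, 1.0, math.e)
print(abs(tau_scaled - a * tau) < 1e-6)
```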
The above model, which is common to many theories, was derived first
by Lemaitre and others from Einstein's General Theory, when it was
interpreted as a dynamical as well as a kinematical model. Later it was
derived by Robertson and the writer independently as a purely kine-
matical model, based on fewer assumptions than in General Relativity,
and this derivation is generally regarded as satisfactory and adequate in
modern cosmological theories. Nevertheless it is far from satisfactory as
an example of the axiomatic method largely because of the initial as-
sumption that events can be described by numerical parameters, i.e. that
the natural topology of a geometrical model of the universe is that of a
manifold. This is a good working hypothesis in that it produces useful
results quickly, but we now wish to base the structure on a more ele-
mentary set of axioms.
We shall be talking about particles (fundamental particles) and the
events in the history of a particle, and the present purpose is to find a
system of axioms from which we can deduce two theorems; firstly, that
the events in the history of a single particle are 'linearly' ordered, i.e.
can be parametrised by a single real parameter; secondly, that the set of
particles can be given the topology and structure of a geodesic metric
space in such a way that the metric has the properties of metric in the
cosmological model. We shall not go all the way in establishing the spheri-
cal, elliptic, hyperbolic or euclidean manifold structures on the set of
particles, but the final stage is not difficult once the geodesic metric
structure is established, using the work of Busemann, Montgomery and
Zippin and postulating sufficient symmetry about each particle. Our
axiomatic system will in fact cut out the spherical and elliptic models
since it will be postulated that the (light) signal correspondence from one
particle to another is one-one. It would need a more complicated system
to include the models of positive curvature and this is not discussed here.
2. Before the axioms are stated the idea of light signals used to such
good effect by E. A. Milne [2] and others will be described briefly. One of
the primitives in the present system, the signal-mapping of one particle
set of events (world-line) on another, is based on this idea, and one of the
axioms appears artificial until it is related to the situation of equivalence
between particle-observers discussed by Milne.
Milne's particle-observer is the set of events in the history of a particle
together with a 'clock', i.e. a numerical parameter giving temporal order
in the set. If A and B are two particle-observers, light signals can be sent
from one to the other; the time of arrival s' at B can be expressed as a
function $s' = \theta(t)$ of the time t of emission at A, and the time of
arrival t' at A is a function $t' = \bar{\theta}(s)$ of the time s of
emission at B. These 'times' at A and B are recorded by the clocks attached
to A and B. The particle-observers (with their clocks) are said to be
equivalent if the functions $\theta$ and $\bar{\theta}$, called signal
functions, are identical, and it can be shown that if A and B are not
equivalent, then B's clock can be regraduated by a transformation of the
form $s' = \psi(s)$ so that they become equivalent.
Three particle-observers A, B, C are collinear, with B between A and C,
if the light signal from A to C is the same as that from A to B followed by
the signal from B to C, and similarly from C to A. They then form an
equivalent system if they are equivalent in pairs, and it is easily verified
that if 0 and </> are the signal functions between A and B and between A
and C respectively, the condition for this \sO ° (f> = <f> ° 6. A collinear set of
particle-observers equivalent in pairs thus gives rise to a set of commu-
tative signal functions, and from the study of such a set Milne was able
to establish theorems on linear equiva-
lences.
One serious disadvantage of this
treatment is the assumption that particle-
observers can communicate with each
other so that a signal function is assumed
to be knowable. It would be difficult to
embody this assumption in an axiomatic
system and for that reason it will be as-
sumed in the present work that all obser-
vations are to be made by only one observer.
Thus if A is this observer and if A and
B are equivalent in Milne's sense. A cannot
observe the signal function 6 between A
and B but he can observe the function
02 _ Q o 0 since if a light signal emitted
by A at time / is reflected at B and is
received by A at time tf, then t' is an
observable function of t given by t' = 02(t).
Again, if collinear particles A, B, C are equivalent in Milne's sense and if
θ, φ are the signal functions between A and B and between A and C as before,
then θ ∘ φ = φ ∘ θ; but if A is the only observer, this is not an observable
relation. A consequence, however, is θ² ∘ φ² = φ² ∘ θ², and this is an
observable relation since θ² and φ² are observable. This simple relation is
independent of the choice of clock scales and can be illustrated as in fig. 1,
where the
Fig. 1
AXIOMS FOR COSMOLOGY 311
vertical lines represent world-lines and the other lines light paths. Although
derived from Milne's idea of equivalence it is in fact weaker, and provides
the main suggestion for our Axiom IX, which is equivalent to it when
applied to collinear particles.
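The commutation relation can be tried out numerically in a toy model. The sketch below assumes, purely for illustration, signal functions of the multiplicative form θ(t) = αt and φ(t) = βt; the constants ALPHA and BETA, the helper compose and the extra function chi are hypothetical and do not come from the text. It checks that the observable round-trip functions θ² and φ² commute, and that a function outside the family need not commute with θ.

```python
def compose(*funcs):
    """Compose right to left, so compose(f, g)(t) == f(g(t))."""
    def composite(t):
        for func in reversed(funcs):
            t = func(t)
        return t
    return composite

ALPHA, BETA = 2, 3  # hypothetical constants for the toy model

def theta(t):
    """Assumed signal function between A and B."""
    return ALPHA * t

def phi(t):
    """Assumed signal function between A and C."""
    return BETA * t

def chi(t):
    """A function that does not belong to the commuting family."""
    return t + 1

theta2 = compose(theta, theta)  # round trip A -> B -> A, observable by A
phi2 = compose(phi, phi)        # round trip A -> C -> A, observable by A

samples = [1, 2, 5, 10]
round_trips_commute = all(
    compose(theta2, phi2)(t) == compose(phi2, theta2)(t) for t in samples
)
chi_commutes = all(
    compose(theta, chi)(t) == compose(chi, theta)(t) for t in samples
)
```

In this model the round trips commute on every sample, while chi fails to commute with theta, which suggests that commutativity is a genuine restriction on the signal functions rather than an automatic property.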
3. The primitives of the axiomatic system to be considered here are
events and certain sets of events called particles. The events of one particle
O, called the observer, satisfy a total order relation, described by the words
'before' and 'after'; if x and y are distinct events of O, then either x < y
(x is before y, equivalent to y > x, i.e. y is after x) or y < x. Lastly, if A
and B are any two particles, there is a signal-mapping of A onto B,
denoted by (A, B).
AXIOM I. The order relation in O is transitive, i.e. if x, y, z are events
of O such that x < y and y < z, then x < z.
AXIOM II. Every signal-mapping is one-one.
Thus (A, B) has a single valued inverse (A, B)⁻¹ which is a mapping of
B onto A.
DEFINITION. An OBSERVABLE is a mapping O → O resulting from a
chain of signal mappings or inverse signal mappings.
One example of an observable is the signal-mapping (O, A) followed by
the mapping (A, O); this will be denoted by (O, A)(A, O), which is here
more convenient than the usual (A, O) ∘ (O, A). Another example is
(O, A)(A, B)(O, B)⁻¹, which will turn out later to be the identity mapping
O → O when O, A, B are 'collinear'.
All further axioms can now be expressed in terms of observables and
the order relation on O, but for convenience we shall define and use
relative observables.
By means of the mapping (O, A) and the order relation on O, an order
relation can be induced on A, and we shall use the symbols <, > and words
'before' and 'after' when describing this relation. An OBSERVABLE
RELATIVE TO A is defined as a mapping A → A resulting from a chain of
signal mappings and inverses, and an axiom may be expressed in terms of
observables relative to any particle A and the order relation on A. This
is only a matter of convenience; any such expression could always be
restated in terms of proper observables, i.e. observables relative to O, for
if f is an observable relative to A, then (O, A) f (O, A)⁻¹ is a corresponding
proper observable.
Let A, B, C be any three particles, and let g be the observable relative to A
defined as follows (see fig. 2):
g = (A, B)(B, C)(A, C)⁻¹.
AXIOM III. g(a) ≥ a for all events a ∈ A.
AXIOM IV. g is strictly increasing, i.e. if a, a' are events of A such that
a' > a, then g(a') > g(a).
It is to be understood that in axioms such as these the particles are not
necessarily distinct, i.e. B or C may be a copy of A, with the convention that
the signal mapping (A, A) is the identity A → A. Putting C = A in Axioms III
and IV we thus have that the observable f relative to A, defined by
f = (A, B)(B, A), satisfies the same conditions as g in the axioms.
We also see from Axiom IV with A = O that if B and C are any two
particles, the order relation induced in C from B by means of the mapping
(B, C) is the same as the order induced in C from O. It follows that the
observer O loses its preferential position; the whole system is the same
relative to a 'subordinate observer' at A with the order relation induced
from O.
It can now be assumed without a further axiom that the ordered set O is
closed, and therefore that every particle set is closed, i.e. that every
bounded sequence of events in a particle has a limit. If the particle sets
are not closed, new events can be defined in the usual way as sections or
by sequences, and the sets of new events are closed. Further, the signal
mappings can be extended in a natural way to the new particle-sets so
that the above axioms are still satisfied. It will therefore be assumed that
O, and hence every particle set of events, is closed.
DEFINITION. Particles A and B COINCIDE at the event a ∈ A if f(a) = a
where f = (A, B)(B, A).
We see that if A and B coincide at a ∈ A and if (A, B)(a) = b, then A
and B coincide at the event b ∈ B; for we have f(a) = (B, A) ∘ (A, B)(a) =
(B, A)(b), so that (B, A)(b) = a, and hence (B, A)(A, B)(b) =
(A, B) ∘ (B, A)(b) = (A, B)(a) = b.
We could, if we wished to follow the mechanical picture, regard a and
b here as the same event and so allow particle sets to intersect. This is
unnecessary, however, because our particles are restricted to correspond
to what were formerly called fundamental particles and therefore are
required not to coincide, which leads to the next axiom.
AXIOM V. No two distinct particles coincide at any event.¹
It follows that no particle set has a first or last event (assuming that
there is more than one particle; see Axiom VII).² For if a is a last event of
a particle A and B is another particle, then by Axiom III with C = A,
f(a) ≥ a where f = (A, B)(B, A) and hence f(a) = a, i.e. A and B coincide
at a. Similarly A and B would coincide at a first event of A.
4. DEFINITION. Particles A, B, C are COLLINEAR, with B between A
and C, if
(A, B)(B, C) = (A, C), (C, B)(B, A) = (C, A).
These conditions can be expressed in terms of observables and require
the observables relative to A, given by (A, B)(B, C)(A, C)⁻¹ and
(C, A)⁻¹(C, B)(B, A), both to be the identity mapping A → A. They
correspond, therefore, to the case of equality in Axiom III.
AXIOM VI. If A, B, C, D are particles with A, B, C collinear in some
order, A, B, D collinear in some order, and A, B distinct, then A, C, D are
collinear in some order.
It follows from this that the set of all particles collinear, in some order,
with two distinct particles is a linear system which is determined by any
two distinct members. From the 'between' relation in the above definition
and Axiom III we see that a linear system is totally ordered. In particular
we can talk about particles of a linear system being on the 'same side' or
on 'opposite sides' of a member of the system.
1 This axiom is in fact redundant, but to do without it would mean a great deal
of additional work. A theorem equivalent to this axiom is proved in [4] where also
Axiom VIII is weakened.
2 We could, of course, make an exception of first and last events in Axiom V,
and it would then follow that if one particle has, for example, a first event then all
particles coincide at this event, which is what happens in Milne's model with the
t-scale of time. However, extreme events of this kind can be excluded without
affecting the system and the form of Axiom V given here appears to be preferable.
DEFINITION. A linear system L is DENSE at a particle A ∈ L if, for any
two events a, a' of A with a' > a, there is a particle B ∈ L distinct from A
such that (A, B)(B, A)(a) < a'.
It is not difficult to prove that, if L is dense at A, there is a sequence
{A_n} of particles in L and on the same side of A such that, for every event
a ∈ A, f_n(a) → a as n → ∞ where f_n = (A, A_n)(A_n, A). We shall write
A_n → A.
It should be noted that this definition of denseness in a linear system is
stronger than denseness in the ordinary sense for a totally ordered set
since it involves the ordered set A of events. One consequence, of course,
is that L is dense at A in the ordinary sense that if B, C are members of L
with A between them, then there is another member of L which is distinct
from A, B, C and between B and C. Another consequence is as follows.
If there is a linear system of particles which is dense at some member
then the particle set O of events (and hence every particle set) is continuous
in the sense that if x, y are any two events of O and x < y, there is an
event z of O such that x < z < y.
AXIOM VII. There are at least two distinct particles.
Hence there is at least one linear system of particles.
AXIOM VIII. Every linear system of particles is dense at every member 3.
An immediate consequence of this axiom, as remarked above, is that
every particle set of events is continuous. We are now in a position to
prove the following theorem.
The particle set O, and hence every particle set, is ordinally equivalent to the
continuum of real numbers.
It is sufficient to prove this for any one particle A, and since we already
have that the set A is closed, it is sufficient to prove that there is an
enumerable subset of A such that between any two events of A is a
member of the subset.
3 Because of an axiom of symmetry which comes later (§ 6), Axiom VIII could be
weakened to state that every linear system is dense at some member; it would then
follow from symmetry that the system is dense at every member. The present axiom
is chosen, however, so that certain theorems can be proved immediately. (Cf. [4].)
Let L be a linear system containing A. Then L is dense at A by Axiom
VIII and there is a subset {A_n} of L such that A_n → A. As before we
write f_n for (A, A_n)(A_n, A) and define f_n^p for any integer p in the usual
way; thus f_n^0 is the identity mapping A → A, f_n^1 = f_n, f_n^{p+1} = f_n ∘ f_n^p.
Let a be an event of A, and consider the subset of A given by the events
f_n^p(a) where n takes all positive integer values and p takes all integer
values. This subset is enumerable, and we shall prove that if x, y are
any two events of A and x < y, there is a member of this subset between
x and y. It will be sufficient to consider the case a < x < y; the proof for
the case x < y < a is very similar, and the case x < a < y is trivial.
Let x, y be events of A and a < x < y. Then since A_n → A, there is an
n such that f_n(x) < y. Keeping n fixed, consider the sequence f_n^m(a)
where m takes positive or zero integer values. This sequence is unbounded
as m → ∞, for if f_n^m(a) → z as m → ∞ then f_n(z) = z and A_n coincides
with A at the event z, which contradicts Axiom V. Hence, since a < x,
there is a positive or zero integer m such that
f_n^m(a) ≤ x < f_n^{m+1}(a).
The mapping f_n is increasing by Axiom IV and hence
x < f_n^{m+1}(a) = f_n(f_n^m(a)) ≤ f_n(x) < y,
i.e. the event f_n^{m+1}(a) in the enumerable subset of A is between x and y,
as required.
The proof for x < y < a is similar but with inverse signal mappings
f_n^{-m}.
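The construction in this proof can be followed step by step in a simple model. The sketch below assumes, for illustration only, that A carries a clock on which the round-trip observable of A_n is f_n(t) = t + 2/n; this form, and the names f and event_between, are assumptions and are not part of the axioms. Given a < x < y it first chooses n with f_n(x) < y and then iterates f_n from a until the next iterate passes x.

```python
from fractions import Fraction

def f(n, t):
    """Round-trip observable f_n in the assumed model: t -> t + 2/n."""
    return t + Fraction(2, n)

def event_between(a, x, y):
    """Return (n, m, e) with e = f_n^(m+1)(a) and x < e < y, assuming a < x < y."""
    if not a < x < y:
        raise ValueError("expected a < x < y")
    # Step 1: denseness gives an n with f_n(x) < y.
    n = 1
    while f(n, x) >= y:
        n += 1
    # Step 2: iterate f_n from a; stop when f_n^m(a) <= x < f_n^(m+1)(a).
    m, e = 0, a
    while f(n, e) <= x:
        e = f(n, e)
        m += 1
    return n, m, f(n, e)

n, m, e = event_between(Fraction(0), Fraction(7, 3), Fraction(5, 2))
```

Exact rational arithmetic is used so that the strict inequalities x < e < y are not blurred by rounding.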
This theorem shows that every particle set can be mapped onto the
continuum of real numbers so that order is preserved in the sense that
'before' corresponds to 'is less than'. Such a mapping will be called a
clock, and the real number corresponding to an event is a clock reading.
The mapping is not, of course, unique, and a change of mapping may be
called a 'clock regraduation'; it corresponds to a transformation t' = ψ(t)
of the 'time' parameter t, where ψ is a continuous increasing function
taking all values.
When a particle A is provided with a clock in this way, every observable
relative to A can be represented as a function of the time parameter, and
from the axioms it follows that all such functions are continuous and
increasing and take all values. If f is such a function, it is transformed into
ψ ∘ f ∘ ψ⁻¹ when A's clock is regraduated by t' = ψ(t).
5. The next axiom was suggested by a property of Milne's equivalent
system of collinear particle-observers (see § 2) but applies to any three
particles, not necessarily collinear.
AXIOM IX. If A, B, C are any three particles then (A : B, C) =
(A : C, B), where (A : B, C) denotes the observable (A, B)(B, C)(C, B)(A, B)⁻¹
relative to A.
If A, B, C are collinear with A not between B and C this axiom is
seen to be already satisfied. If however A is between B and C, then
(C, B)(A, B)⁻¹ = (C, A) and the axiom gives
(A, B)(B, C)(C, A) = (A, C)(C, B)(B, A)
i.e.
(A, B)(B, A)(A, C)(C, A) = (A, C)(C, A)(A, B)(B, A),
showing that the two observables (A, B)(B, A) and (A, C)(C, A) relative
to A commute.
Again, if A, B, C are collinear with B between A and C, then the axiom
applied to B, A, C in this order gives
(B, A)(A, C)(C, B) = (B, C)(C, A)(A, B)
and hence
(A, B)(B, A)(A, C)(C, B)(B, A) = (A, B)(B, C)(C, A)(A, B)(B, A)
i.e.
(A, B)(B, A)(A, C)(C, A) = (A, C)(C, A)(A, B)(B, A)
since (C, B)(B, A) = (C, A) and (A, B)(B, C) = (A, C). Thus the
observables (A, B)(B, A) and (A, C)(C, A) relative to A commute as before.
A similar result occurs if C is between A and B. Hence:
If A, B, C are collinear in some order, the observables (A, B)(B, A)
and (A, C)(C, A) relative to particle A commute.
Consider now a linear system L of particles containing A, and for X ∈ L
denote by f_X the observable (A, X)(X, A) relative to A. Then from what
has just been proved, if X, Y are any two members of L,
f_X ∘ f_Y = f_Y ∘ f_X.
Suppose now A is assigned a clock, i.e. a 'time' parameter t; then
observables such as f_X are represented by continuous increasing functions
f_X(t) taking all values, and any two such functions corresponding to
members of L commute. We thus have a system L of commutative
functions which, because of the denseness of L at A, contains a sequence
which converges uniformly to the identity. Hence, from a theorem on sets
of commutative functions [3], there exists a continuous increasing function
ψ(t), taking all values, such that every function f_X ∈ L can be expressed
in the form
f_X(t) = ψ⁻¹{2d_X + ψ(t)}
where d_X is a positive or zero constant depending upon X.
If now A's clock is regraduated to read time τ where τ = ψ(t), the
observable function f_X(t) is transformed into the function τ + 2d_X. We
have thus proved that:
If L is a linear system of particles containing A, a clock reading time τ
can always be assigned to A so that, if X is any member of L, the
observable (A, X)(X, A) relative to A is represented by the function
τ + 2d_X where d_X is a positive or zero constant depending upon X. Such
a clock will be called a τ-CLOCK relative to L.
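The passage from commuting observables to translations can be illustrated numerically. The sketch below assumes a family of commuting round-trip observables of the form t ↦ kt on a clock reading t > 0 (a hypothetical family chosen for illustration, with k ≥ 1), for which the regraduation ψ = log plays the part of the function whose existence the theorem asserts. The names make_f, regraduate and half_shift are introduced here and are not from the text.

```python
import math

def make_f(k):
    """Hypothetical round-trip observable on A's original clock: t -> k*t, t > 0."""
    return lambda t: k * t

psi = math.log      # assumed regraduation tau = psi(t)
psi_inv = math.exp  # its inverse

def regraduate(f):
    """Return psi o f o psi^-1, the same observable read on the regraduated clock."""
    return lambda tau: psi(f(psi_inv(tau)))

def half_shift(f_tau, tau):
    """Return (f(tau) - tau) / 2, which is d_X when f has the form tau + 2*d_X."""
    return (f_tau(tau) - tau) / 2

f_tau = regraduate(make_f(4.0))
shifts = [half_shift(f_tau, tau) for tau in (-2.0, 0.0, 1.5)]
```

On the regraduated clock the shift is the same at every reading, namely (log 4)/2 up to rounding, so the observable has become a translation τ + 2d_X with d_X = ½ log k in this model.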
It also follows from the theorems on commutative functions that A's
τ-clock is determined uniquely by the linear system except for an arbitrary
affine regraduation τ' = aτ + b, a > 0. The only effect of such a
regraduation on the observable functions τ + 2d_X is to multiply all the
constants d_X by the same factor a.
We now define the DISTANCE d(A, X) from A to X to be d_X. We
observe that d(A, X) ≥ 0, and equality occurs when and only when
X = A. In terms of readings on A's τ-clock the distance d(A, X) is given
by
d(A, X) = ½(τ̄ − τ)
where τ has any value and τ̄ = f_X(τ), f_X = (A, X)(X, A). This formula
indicates again how the 'scale' of distance depends upon the choice of
τ-clock; under the allowable change of τ-scale given by τ' = aτ + b we get
d'(A, X) = ½(τ̄' − τ') = ½a(τ̄ − τ) = a d(A, X).
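The scaling rule can be checked with exact arithmetic. The following sketch assumes a round-trip observable of the form τ + 2d on A's τ-clock with a hypothetical value d = 3/2, and re-expresses it on the clock τ' = aτ + b; the helper names distance and affine_regraduate are illustrative only.

```python
from fractions import Fraction

def distance(f, tau):
    """Distance read off a round-trip observable f: (f(tau) - tau) / 2."""
    return (f(tau) - tau) / 2

def affine_regraduate(f, a, b):
    """Observable f re-expressed on the clock tau' = a*tau + b, with a > 0."""
    return lambda tau_new: a * f((tau_new - b) / a) + b

D = Fraction(3, 2)  # hypothetical distance on the original tau-clock

def f_x(tau):
    """Round-trip observable tau + 2*d on the original tau-clock."""
    return tau + 2 * D

a, b = Fraction(5), Fraction(7)
f_x_new = affine_regraduate(f_x, a, b)
d_old = distance(f_x, Fraction(1))
d_new = distance(f_x_new, Fraction(1))
```

Here d_new equals a times d_old whatever reading is used, and the additive constant b drops out.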
When a τ-clock has been assigned to A, a clock can be assigned to any
other particle X ∈ L by taking the clock reading at the event (A, X)(τ)
to be τ + d_X, and it can easily be verified that this parametrisation of X
is for X a proper τ-clock relative to L.
DEFINITION. The τ-clock assigned to X as above is EQUIVALENT to
A's τ-clock.
It can be verified that for all particles of L and τ-clocks relative to L,
equivalence as defined here is reflexive and transitive. Also, for any
particles X, Y of L, the distance d(X, Y) measured in relation to a τ-clock
of X (relative to L) is equal to the distance d(Y, X) measured in relation
to the equivalent τ-clock of Y, and for any three particles X, Y, Z of L,
with Y between X and Z,
d(X, Z) = d(X, Y) + d(Y, Z)
where distances are measured in relation to equivalent τ-clocks relative
to L.
If a τ-clock undergoes an additive regraduation τ' = τ + b, it becomes
a τ-clock and distances measured in relation to it are unaltered. However,
if one of two equivalent τ-clocks undergoes this regraduation with b ≠ 0,
it ceases to be equivalent to the other, and we shall say that the clocks are
then congruent.
DEFINITION. If τ-clocks relative to L are attached to two particles of a
linear system L and are equivalent to within additive regraduations, they
are CONGRUENT.
From the properties of equivalence it follows that for all particles of L
and τ-clocks relative to L, congruence is reflexive and transitive. Also,
the distance relations d(X, Y) = d(Y, X), d(X, Z) = d(X, Y) + d(Y, Z)
for particles of L hold when distances are measured in relation to
congruent τ-clocks.
6. We now wish to extend this idea of distance from a linear system to
the whole system of particles and so establish a metric on the 'space' of
particles. For this we need the general form of Axiom IX together with a
new axiom of symmetry.
Consider first a mapping p of the set of particles onto itself which leaves
a particle A invariant. An observable f relative to A is a mapping A → A
determined in some way by a sequence of particles B, C, ..., and the
transform of f under p may be defined as the observable A → A
determined in the same way by the sequence B', C', ... where B' = p(B),
etc.
DEFINITION. An A-TRANSFORMATION is a one-one mapping of the set
of particles onto itself which leaves A invariant and is such that every
observable relative to A is transformed into itself.
Since the property of collinearity of particles can be defined in terms of
observables relative to any particle A, it follows that a linear system of
particles is mapped onto a linear system by any A-transformation. Also,
if L is a linear system containing A and if L is mapped onto L' by an
A-transformation, a τ-clock of A relative to L is also a τ-clock relative to L'.
If X, Y are members of L and d(X, Y) the distance between them in
relation to a τ-clock of A, and if X', Y' are the images of X, Y under the
A-transformation and d(X', Y') the distance between them in relation to
the same τ-clock of A, then d(X', Y') = d(X, Y).
DEFINITION. A HALF-LINE at a particle A is part of a linear system
containing A ; it consists of A and all the particles on one side of A .
Thus a linear system containing A is the union of two half-lines at A .
If B is a particle distinct from A, there is just one half-line at A which
contains B.
AXIOM X. If A is any particle and l, m any two half-lines at A, there is
an A-transformation which maps l onto m.
We can now talk of any observer A being assigned a τ-clock without
reference to any particular linear system containing A; a τ-clock relative
to one such linear system will also be a τ-clock relative to any other
because of Axiom X and the property of an A-transformation mentioned
above. Thus to every observer can be assigned a τ-clock which is unique
to within an arbitrary affine regraduation.
The definition of congruence of τ-clocks given in § 5 applies to any two
observers, and we can now prove that this congruence is transitive, i.e. if
to any three observers A, B, C are assigned τ-clocks such that those of A
and B are congruent and those of A and C are congruent, then those of B
and C are congruent. To prove this it is sufficient to prove that the
distances d(B, C) and d(C, B), measured in relation to the τ-clocks
assigned to B and C respectively, are equal. This is a consequence of the
formula ½(τ̄ − τ) for distance in terms of τ-clock readings and the
definition of congruent τ-clocks applied to the pairs A, B and A, C, for we
find that, for any number τ,
d(B, C) = ½{(A : B, C)(τ) − τ}, d(C, B) = ½{(A : C, B)(τ) − τ}
and these are equal by Axiom IX.
Thus τ-clocks can be assigned to all particles so that they are congruent
in pairs, and if a particular τ-clock is assigned to one particle, say O, the
congruent τ-clock attached to any other particle is unique to within an
arbitrary additive regraduation τ' = τ + b. Such a regraduation does not
affect measurements of distance and hence the distance d(X, Y) between
any two particles X, Y is uniquely determined. If O's τ-clock undergoes an
affine regraduation τ' = aτ + b, a > 0, then all distances are multiplied
by the same factor a. We have thus defined a METRIC on the set of
particles, and it is easily verified that all the fundamental properties of a
metric are satisfied. For example, the triangular inequality
d(A, C) ≤ d(A, B) + d(B, C)
is mainly a consequence of Axiom III, for we have in terms of A's τ-clock
and for any number τ, 2d(B, C) = τ̄ − τ where
τ̄ = (A, B)(B, C)(C, B)(A, B)⁻¹(τ).
Also,
2d(A, B) = (A, B)(B, A)(τ̄) − τ̄
and hence
2d(A, B) + 2d(B, C) = (A, B)(B, C)(C, B)(B, A)(τ) − τ.
Since by Axiom III, (A, B)(B, C)(τ) ≥ (A, C)(τ) = τ' say, and
(C, B)(B, A)(τ') ≥ (C, A)(τ'), we have from Axiom IV
2d(A, B) + 2d(B, C) ≥ (C, B)(B, A)(τ') − τ ≥ (C, A)(τ') − τ
i.e. 2d(A, B) + 2d(B, C) ≥ (A, C)(C, A)(τ) − τ = 2d(A, C)
as required.
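The product notation and the inequalities used above can be exercised in a very simple model. The sketch below assumes particles at rest at points of euclidean 3-space, with light speed 1 and a common clock, so that the signal mapping (P, Q) sends an emission time t to the arrival time t + |PQ|. This static model and the names signal, inverse, chain and d are assumptions made for illustration; the model is not claimed to satisfy every axiom of the text.

```python
import math

def signal(P, Q):
    """Signal mapping (P, Q) in the assumed static model: emission -> arrival time."""
    delay = math.dist(P, Q)
    return lambda t: t + delay

def inverse(P, Q):
    """Inverse signal mapping (P, Q)^-1 in the same model."""
    delay = math.dist(P, Q)
    return lambda t: t - delay

def chain(*maps):
    """Apply maps left to right, matching the product notation (A, B)(B, C)."""
    def composite(t):
        for m in maps:
            t = m(t)
        return t
    return composite

def d(P, Q, tau=0.0):
    """Distance as half the round-trip time: ((P, Q)(Q, P)(tau) - tau) / 2."""
    return (chain(signal(P, Q), signal(Q, P))(tau) - tau) / 2

A, B, C = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)

g = chain(signal(A, B), signal(B, C), inverse(A, C))             # Axiom III
a_bc = chain(signal(A, B), signal(B, C), signal(C, B), inverse(A, B))
a_cb = chain(signal(A, C), signal(C, B), signal(B, C), inverse(A, C))
```

With these points g(0) is 2, so g(a) ≥ a as Axiom III requires; the observables a_bc and a_cb agree, as in Axiom IX; and d(A, C) = 5 does not exceed d(A, B) + d(B, C) = 7.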
The structure we have given to the set of particles is not merely that
of a metric space; it is that of a geodesic metric space as defined by
Busemann [1], for it is easily verified that the linear systems of particles are
geodesics of the metric space and have the properties required for a
geodesic space. We have thus reached our second objective.
7. By Axiom X the geodesic space of particles is symmetric in the
sense that for any particle A and two half-geodesics at A, there is an
isometry of the space which leaves A invariant and maps one half-geodesic
on the other. From what is already known about geodesic
spaces it would not be difficult to select further axioms to ensure that the
space is 3-dimensional hyperbolic or euclidean. For example, we could
define a ROTATION about A as an isometry which is either the identity
mapping or leaves one and only one geodesic through A point-wise invariant;
then replace 'A-transformation' by 'rotation' in Axiom X and
postulate that the set of all rotations about A is a group.
The final task in the derivation of the space-time model is to establish τ
as a 'cosmic' coordinate, i.e. to show that all the particles can be assigned
τ-clocks which are not merely congruent but also equivalent to each
other. We then have the product structure T × C on the set of all events,
C being the space of particles, and it is a straightforward matter to define
a metric on T × C, determine light paths (defined in terms of linear
systems and signal mappings) and so complete the features of the
cosmological model.
Equivalent τ-clocks have already been defined and it is an open question
whether the transitivity of this equivalence for all particles is a
consequence of the axioms already given. If necessary it is a simple matter to
find an additional axiom which gives the property of transitivity. For
example, if A, B, C are any three particles with τ-clocks such that those
of A and B are equivalent and those of A and C are equivalent, it can be
verified that the observable functions (A, B)(B, C)(C, A)(τ) and
(A, C)(C, B)(B, A)(τ) are both of the form τ + constant, and that they
are the same function if and only if the τ-clocks of B and C are equivalent.
It would be sufficient, therefore, to postulate:
AXIOM XI. If A, B, C are any particles,
(A, B)(B, C)(C, A) = (A, C)(C, B)(B, A).
It is possible, however, that this is a consequence of the previous
axioms.
Bibliography
[1] BUSEMANN, H., The Geometry of Geodesics. Academic Press Inc. New York,
1955, X + 422 pp.
[2] MILNE, E. A., Kinematic Relativity. Oxford 1948, VII + 238 pp.
[3] WALKER, A. G., Commutative functions, I. Quarterly Journal of Mathematics
vol. 17 (1946) pp. 65-82.
[4] WALKER, A. G., Foundations of Relativity. Proceedings of the Royal Society
of Edinburgh, vol. 62 (1948) pp. 319-335.
Symposium on the Axiomatic Method
AXIOMATIC METHOD AND THEORY OF RELATIVITY
EQUIVALENT OBSERVERS AND
SPECIAL PRINCIPLE OF RELATIVITY
YOSHIO UENO
Hiroshima University, Hiroshima, Japan
1. Axiomatization of Relativity Theory. Roughly speaking, there are
two different approaches when we try to examine the foundation of
relativity by means of axiomatic methods. In the first approach one tries
to axiomatize the theory of relativity as it is now. According to the second,
one does not necessarily aim at deriving the present theory. Rather, one
investigates various possible ways of axiomatizing the theory of relativity,
in the hope that one will be able to examine prospective forms of new
theories.
In the first approach, one postulates at the beginning the present
relativity theory as the firmly established theory and asks what set of
axioms is equivalent to the theory. Most of the work done so far has
taken this approach. Certainly, most people accept general relativity as
well as special relativity as firmly established theories, just like classical
mechanics and electrodynamics.
However, one needs to reinvestigate some of the fundamental concepts
of relativity such as space-time, scale, clock and equivalence of observers,
although they are now regarded as completely established beyond any
doubt. For instance, the fact that the so-called clock paradox is still
discussed today indicates that there remains some ambiguity about the
definition and interpretation of an observer or a moving clock.
Furthermore, we know some examples of peculiar structure of space-
time as shown by Gödel's peculiar cosmological solution [1] and also by
another peculiar solution due to Nariai [2]. We cannot reject these peculiar
solutions only from fundamental principles of relativity. This may be
again a reason for reinvestigating fundamental principles of relativity.
Of course, to these peculiar solutions, the respective authors gave physical
interpretations which seem reasonable. However, to ensure the validity of
such interpretations, we will have to understand clearly the fundamental
principles of general relativity. It is beyond any doubt that axiomatic
RELATIVITY AND EQUIVALENT OBSERVERS 323
methods are very useful for the study of this kind. I will not, however, go
into details of such studies here.
Comparing these two alternative approaches, we may say that while
logical formulation is the central problem of the first, heuristic con-
siderations play the main part in the second. Namely, according to the
latter viewpoint the main subject will be to examine in what forms one
can formulate the fundamental concepts of relativity.
From now on I want to deal with the second approach of axiomatic
formulations, namely, how to formulate physical principles of relativity.
In this approach, we are not anticipating the reproduction of special and
general relativity in their present form and content. Rather, my main
concern will be how one can possibly change their content.
Then, what would be the fundamental concept that I should examine
first? One may start from considering the relation between matter and
space-time. Or one may consider first observers and invariance of physical
laws. The latter was the main subject of the work on equivalent ob-
servers, which I did with Takeno [3] , and also of my work [4] on equivalent
observers in special relativity. I shall deal mainly with the subject of
observers and their equivalence. Most of the content of this paper is from
the papers I just mentioned.
2. Equivalent Observers. In general relativity, matter and space-time
are specified by each other, and this is one of the basic characteristics of
the theory. In special relativity matter does not affect directly the
structure of space-time. There, the space-time is independent of the
presence of matter and is an external element which defines modes of
existence of physical phenomena. In special relativity, such modes of
existence of physical phenomena are determined in reference to the state
of an observer.
It is for this reason that we brought up the concept of observers as the
starting point of our work. We considered first the existence of an
observer and discussed its kinematical aspect. Following the work by
Takeno and Ueno [3], I will explain how this was actually done. The first
postulate we made was the existence of a three dimensional space frame
and a one dimensional time frame for an arbitrary observer. We ex-
pressed the postulate in the following way:
PI. Any equivalent observer M is furnished with a three-dimensional
'space-frame' S with origin M and a one-dimensional 'time-frame' T, and
can give one and only one set of space coordinates (x, y, z) and time coordinate
(t) to any point event E to within frame transformation.
Let me first explain what is meant by frame transformation. We
regard two observers relatively at rest as essentially identical. And we
call frame transformation the transformation between the frames of
identical observers, that is, the frames relatively at rest to each other as
well as such transformations of the time axis that simply change the scale
of the time frame, namely, regraduation.
The postulate requires that an observer can give to an event a set of
four real numbers representing coordinates (x, y, z, t) which is uniquely
determined to within frame transformation.
It follows that there exists a relation between the coordinates (x, y, z, t)
given to an event by an observer and the coordinates (x', y', z', t') given
to the same event by another equivalent observer in his own frame.
The relation is
0, (i, j = 1, 2, 3, 4),
(x^1, x^2, x^3, x^4) = (x, y, z, t).
In the above PI, we assumed the existence of a three-dimensional space-
frame and a one-dimensional time-frame. However, it is not necessarily
required that the two frames be combined to form a four-dimensional space-
time. In this sense, this postulate may not be relativistic. Therefore,
the postulate can cover both relativistic and non-relativistic theories.
Namely, the postulate is not characteristic of relativistic theories. In fact,
there are some transformation groups for which we can find no four-
dimensional space-times satisfying the postulate of equivalency.
The second postulate we make requires that any observer can observe
another observer. PI permits an observer to assign a set of coordinates to
any point event, but it does not necessarily follow from this that the
observer can do the same to another observer. The second postulate is
necessary for this reason. It is the following.
PII. Any observer M can observe all other equivalent observers and they
are all in motion relative to M.
Questions may arise as to what is meant by being in motion. Here as
in the ordinary case, we say an observer is in motion relative to M if the
spatial coordinates of the observer, (x, y, z), are changing with time t.
The third postulate is a very important one.
PIII. The group of frame transformations 𝔊₀ is given by the rotations
R_1 = z∂_y − y∂_z, R_2 = x∂_z − z∂_x, R_3 = y∂_x − x∂_y
and the translations
T_1 = ∂_x, T_2 = ∂_y, T_3 = ∂_z
of the space frame S_M of M and the translation U = ∂_t of time frame T_M of
M. And this 𝔊₀ together with the set of transformations given by (1) forms a
continuous group of transformations 𝔊.
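For the rotations to generate a group, their generators must close under the commutator. The sketch below checks this for the three rotation generators only, under a representation that is not in the text: a linear vector field V_M = (Mx)^i ∂_i is encoded by its 3 × 3 coefficient matrix M, and the commutator of two such fields, [V_M, V_N], then has coefficient matrix NM − MN. The helper names matmul and bracket are hypothetical.

```python
def matmul(P, Q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(M, N):
    """Coefficient matrix of [V_M, V_N] for V_M = (M x)^i d/dx^i, i.e. N M - M N."""
    NM, MN = matmul(N, M), matmul(M, N)
    return [[NM[i][j] - MN[i][j] for j in range(3)] for i in range(3)]

# Row i holds the coefficients of the d/dx^i component in terms of (x, y, z).
R1 = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]   # z d/dy - y d/dz
R2 = [[0, 0, -1], [0, 0, 0], [1, 0, 0]]   # x d/dz - z d/dx
R3 = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]   # y d/dx - x d/dy

closes = (bracket(R1, R2) == R3
          and bracket(R2, R3) == R1
          and bracket(R3, R1) == R2)
```

With the sign convention of the postulate the brackets reproduce the generators cyclically, [R_1, R_2] = R_3 and so on, so no new generator is needed for the rotation part.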
The first question concerning this postulate will be why this particular
transformation was chosen as the frame transformation. In our work we
use coordinates without attaching any special meaning to them. Mathe-
matically that should be satisfactory. However, we must examine the
physical meaning of coordinates in order to compare the theory with the
actual world in some way, or to apply the theory to observations of
phenomena.
If the above mentioned frame transformation R_i, T_i, U can be
interpreted as expressing the isotropy and homogeneity of space and the
stationary character of time, then quite naturally, we can regard (x, y, z)
as the cartesian coordinates of the euclidean space and t as the coordinate
of time flowing uniformly. Certainly three-dimensional Riemannian space
whose fundamental tensors are form invariant under coordinate
transformations R_i and T_i is euclidean. This can be easily confirmed. In our
work we have not assumed the metrical structure of space and time. Here
we shall, however, postulate tentatively that the physical world forms
four dimensional space-time. There may exist several ways to determine
the structure of this space-time. Here we shall take, as an example, the
following one tentatively.
Let us first notice that the following postulate we shall take here
completely determines the structure of space-time in which equivalent
observers can exist, and also the scale and the clock of that space-time.
Namely, we postulate that the metric ds of the space-time be form
invariant under the group $\mathfrak{G}$, which is composed of the frame transformations
$\mathfrak{G}_0$ and the transformations among equivalent observers as given
by eq. (1). That is to say, we require that the space-time has the metric
$ds^2$ given by
$ds^2 = g_{ij}\,dx^i dx^j$ $(i, j = 1, 2, 3, 4)$
with $g_{ij}$ which is form invariant under $\mathfrak{G}$. Then, the laws of nature, if they
can be expressed as tensor equations, will be form invariant under $\mathfrak{G}$.
Thus, the laws of nature will assume the same expression for equivalent
observers. This is the actual meaning of the equivalent observers.
We should also notice the following. Namely, as will be shown later by
an example, we found that for certain $\mathfrak{G}$'s, there exists no four-dimensional
space-time of the nature mentioned just now. In such cases, any two
observers connected by $\mathfrak{G}$ in any four-dimensional space-time whatsoever,
will not be equivalent in the above sense. In such cases, we may take a
viewpoint different from that of usual relativistic theories and say that
there exists no four-dimensional space-time. How to interpret such an
extraordinary case must be determined in each case.
Now let us return to the main story. The next postulate is:
PIV. If M and M' are any two equivalent observers, they are in radial
motion with respect to each other, and, furthermore, if M observes any E on
the straight line MM' , then M' also observes the same E on the straight line
M'M, independently of each time coordinate t and t' . Here, a straight line
means the set of all the points invariant under any rotation of S.
Implicit in this postulate is an assumption that we can treat three-
dimensional space in analogy with one-dimensional space. Certainly this
assumption will be natural. However, there are things characteristic of
one-dimensional space. Therefore, we need to be careful.
Here I shall only mention the results obtained from the postulates I
discussed so far, and shall not explain the actual calculations we did.
We found that the transformations between equivalent observers thus
obtained were classified into the following three types. They are:
(a) Lorentz-type transformation
$x' = (x - vt)/\sqrt{1 - av^2}$, $y' = y$, $z' = z$, $t' = (t - avx)/\sqrt{1 - av^2}$.
(b) Galilei transformation.
$x' = x - vt$, $y' = y$, $z' = z$, $t' = t$.
(c) K-transformation (as named by Takeno).
$x' = x - v\exp(at)$, $y' = y$, $z' = z$, $t' = t$.
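The three families can be written down as small functions and compared. The following is a minimal sketch, not taken from the original papers: the function names and the numerical values are hypothetical, motion is restricted to the x-axis, and the constant in each family is written `a`. It checks that type (a) preserves the quadratic form $x^2 - t^2/a$ and reduces to type (b) when $a = 0$; with $a = 1/c^2$ type (a) is the usual Lorentz transformation.

```python
import math


def lorentz_type(x, t, v, a):
    """Type (a): Lorentz-type transformation along x (requires a*v*v < 1)."""
    g = 1.0 / math.sqrt(1.0 - a * v * v)
    return g * (x - v * t), g * (t - a * v * x)


def galilei(x, t, v):
    """Type (b): Galilei transformation along x."""
    return x - v * t, t


def k_transformation(x, t, v, a):
    """Type (c): K-transformation along x."""
    return x - v * math.exp(a * t), t


def interval(x, t, a):
    """Quadratic form x^2 - t^2/a, preserved by type (a) when a != 0."""
    return x * x - t * t / a


if __name__ == "__main__":
    x, t, v = 2.0, 3.0, 0.4   # hypothetical event and velocity parameter
    a = 0.25                  # hypothetical constant
    xp, tp = lorentz_type(x, t, v, a)
    print(interval(x, t, a), interval(xp, tp, a))        # the two values agree
    print(lorentz_type(x, t, v, 0.0), galilei(x, t, v))  # a = 0 gives Galilei
```

Setting `a = 0` in `lorentz_type` is only a convenient way to exhibit the Galilei case as a limit; the K-transformation is not a limit of either of the other two.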
It is very interesting that we obtained the Lorentz-type transformation
without any assumption on the relative motion of observers. I will discuss
this point later. Here I shall discuss the K-transformation. A characteristic
feature of this transformation is that a point at rest in system S' moves
in the (ST) system with a velocity proportional to the distance between the
origins of the S and S' systems. Namely, we obtain from the above
equation
$[dx/dt]_{x' = \text{const.}} = a\,[x]_{x' = 0}$.
This relation reminds us of the velocity-distance relation of nebular
motion. If we choose a as Hubble's constant, this expression can be
interpreted as Hubble's relation in the steady-state theory due to Bondi
and Gold [5]. Assuming that we regard the postulate PIII as expressing
the isotropy and homogeneity of space as well as the uniformity of time,
it may be interesting to consider the relation between the assumption of
invariance of the laws of nature under the K-transformation and the perfect
cosmological principle in the steady-state theory of cosmology. Thus we
may say that PIII satisfies in a sense the conditions required by the
perfect cosmological principle. In other words, we may say that PIII
expresses the essential content of the perfect cosmological principle.
Furthermore, there are many questions concerning the K-transformation,
such as: what invariant relations do we have under this transformation? or
what kind of dynamics corresponds to this transformation? We are now
studying the applicability of the transformation to cosmology.
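The velocity-distance relation obtained above for the K-transformation can be checked numerically. The sketch below is illustrative only: the function names and parameter values are hypothetical, and the derivative is estimated by a central difference. A point at rest at $x'$ in S' has position $x = x' + v\exp(at)$ in (ST), so its velocity should equal $a$ times the position of the origin of S', whatever the value of $x'$.

```python
import math


def x_of_rest_point(x_prime, t, v, a):
    """Position in (ST) of a point at rest at x' in S', from x' = x - v*exp(a*t)."""
    return x_prime + v * math.exp(a * t)


def velocity(x_prime, t, v, a, h=1e-6):
    """Central-difference estimate of dx/dt at fixed x'."""
    ahead = x_of_rest_point(x_prime, t + h, v, a)
    behind = x_of_rest_point(x_prime, t - h, v, a)
    return (ahead - behind) / (2.0 * h)


def origin_separation(t, v, a):
    """Position in (ST) of the origin of S', i.e. of the point x' = 0."""
    return x_of_rest_point(0.0, t, v, a)


if __name__ == "__main__":
    v, a = 0.3, 0.7   # hypothetical values in arbitrary units
    for t in (0.0, 1.0, 2.5):
        for x_prime in (0.0, 1.0, -4.0):
            lhs = velocity(x_prime, t, v, a)
            rhs = a * origin_separation(t, v, a)
            print(t, x_prime, lhs, rhs)
```

The printed pairs agree for every `x_prime`, which is the sense in which the velocity depends only on the separation of the two origins.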
Lastly we shall remark on some problems concerning the structure of
space and time. An especially remarkable feature of the K-transformation
is that there exists no four-dimensional space-time of which the metric
is form invariant under the group $\mathfrak{G}$ comprising the K-transformation.
Hence, it is not proper to imagine in the above stated sense a four-
dimensional space-time as the background in which we consider equivalent
observers connected by the K-transformation. We, therefore, expect that
a cosmology completely different from the relativistic one will come out if
we adopt this transformation.
3. Equivalent Observers in Special Relativity. Now I want to change my
subject to the work I did on equivalent observers in special relativity.
The main problem is how to axiomatize fundamental principles of special
relativity. Let us consider first the special principle of relativity. How to
express this principle differs somewhat from person to person. Here, I
borrow from the statement by Einstein himself [6].
If K is an inertial system, then every other system K', which moves uniformly
and without rotation to K, is also an inertial system: the laws of
nature are in concordance for all inertial systems.
The principal concepts which should be examined in this principle are
the following: first, inertial system and uniform motion; then what is
actually meant by the statement that the laws of nature are in concor-
dance for all inertial systems. In my paper [4], I discussed mainly this
principle and did not touch the principle of constancy of light velocity.
Now we shall try to axiomatize the special principle of relativity. At the
beginning we postulate the existence of observers, space-frame and time-
frame. First, we make the same postulate as PI we gave before in Section
2. We shall call it AI here.
AI. Any equivalent observer M is furnished with a three-dimensional
'space-frame' S with origin M and a one-dimensional 'time-frame' T, and
can give one and only one set of space coordinates (x, y, z) and time coordinate
(t) to any point event E to within frame transformation.
By this postulate, it becomes possible to assign a set of space
coordinates (x, y, z) and a time coordinate (t) to any point event. The
postulate specifies the three-dimensionality of space and the one-dimensionality
of time. An important conclusion of relativity tells us that space and
time cannot be separated as two independent objective entities. However,
it does not follow from this conclusion that the space and time cannot be
separated for each individual observer. Hence, our postulate is not in
contradiction with the existence of the space-time in relativistic sense.
From AI we can conclude that there exist different observers and co-
ordinate transformation between their space and time frames.
Next we adopt PII stated in Section 2 and call it AII.
AII. Any observer M can observe all other equivalent observers and they
are moving relative to M.
Thirdly, we postulate the existence of uniform motion. This is the
central point of the theory.
AIII. There exist point events which move uniformly.
Instead of postulating the existence of uniform motions as done here,
we could have postulated the existence of clock and scale to define the
structure of space and time, and could have obtained the same result.
However, we want to use only kinematical concepts at the beginning.
Now questions arise as to what objects make uniform motion and also as
to how one can recognize uniform motion. The answer to these could be
given by introducing dynamical concepts. For instance, one could define
uniform motion from the absence of external forces. However, if we want
to proceed following this line of thought, dynamical aspects must be
postulated first. Here we shall not, however, do this. The actual problem
here will be how to express the uniform motion in the space-and-time-
frame. Next we shall consider this problem.
By AI each observer was given a space frame and a time frame.
However, there still remained the degree of freedom of the frame transfor-
mations. Using this freedom, we shall choose the space and time frames
so that we can express uniform motion in a simple way.
DEFINITION 1. We call a coordinate system a NORMAL FRAME if the
coordinates (x, y, z, t) of a point event in uniform motion satisfy the following
relations in this frame.
(2) $x = v_x t + c_x$, $y = v_y t + c_y$, $z = v_z t + c_z$.
Here the v's and c's are constants. By these relations, we have now an
expression for uniform motion. Now we shall consider the frames which
are in uniform motion. In the following, we shall exclusively deal with
normal frames.
DEFINITION 2. If any point at rest in frames (S'T') of an observer M'
has always the coordinates that satisfy the relation (2) with the same v's in
frame (ST) of another observer M, then frame (S'T') is IN UNIFORM MOTION
RELATIVE TO (ST).
The existence of such a normal frame can be a question. That is to
say, we are given the uniform motion by postulate, but it is not guaranteed
that we can always find a frame in which we can express the uniform
motion by equation (2). Hence, we shall assume the existence of a normal
frame.
AIV. To each observer, there exists a normal frame.
The next axiom is a key point of the special principle of relativity.
AV. Any normal frame which can be obtained by frame transformation
from a normal frame (ST) or any normal frame which is moving uniformly
relative to (ST) is equivalent to (ST) .
The word "equivalent" used in the above AV means that the laws of
nature are in concordance for the frames under consideration. We shall
postulate the following set of axioms for equivalency. These hold for the
usual equality relation. Writing $A \equiv B$ to express that A is equivalent to
B, we shall postulate the following relations:
AVI. Axiom of equivalence.
(i) $A \equiv A$,
(ii) if $A \equiv B$, then $B \equiv A$,
(iii) if $A \equiv B$ and $B \equiv C$, then $A \equiv C$.
From the above AVI we can easily derive the following theorem.
THEOREM 1. Coordinate transformations between equivalent frames form
a group.
From the above axioms we can obtain the explicit form of the co-
ordinate transformation from one normal frame to another. It is
(3) $x'^i = a^i_j x^j + c^i$, $\det(a^i_j) \neq 0$, $(i, j = 1, 2, 3, 4)$.
Here the a's and c's are constants. As is well known, these transformations
form the affine group. Therefore we obtain the following theorem.
THEOREM 2. The set of transformations between normal frames forms
the affine group.
Evidently, this group includes as a sub-group the group of frame transfor-
mations.
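The content of Theorem 2 can be illustrated with a short computation. The sketch below is not from the original paper; the helper names and the two sample transformations are hypothetical. It represents a transformation of the form (3) as a pair (matrix, shift), checks that applying two of them in succession agrees with a single transformation of the same form, and checks that events on a worldline of the form (2) remain on a straight line in the four coordinates.

```python
def mat_vec(a, x):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def mat_mul(a, b):
    """Matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(frame, event):
    """Apply x' = a x + c, as in (3), to an event (x, y, z, t)."""
    a, c = frame
    return [s + ci for s, ci in zip(mat_vec(a, event), c)]


def compose(second, first):
    """Single transformation equal to `first` followed by `second`."""
    a2, c2 = second
    a1, c1 = first
    return mat_mul(a2, a1), [s + ci for s, ci in zip(mat_vec(a2, c1), c2)]


def worldline(v, c, t):
    """Event (x, y, z, t) of a point in uniform motion, as in (2)."""
    return [v[k] * t + c[k] for k in range(3)] + [t]


def are_collinear(p0, p1, p2, tol=1e-9):
    """True if the three events lie on one straight line in four dimensions."""
    d1 = [b - a for a, b in zip(p0, p1)]
    d2 = [b - a for a, b in zip(p0, p2)]
    n = len(d1)
    return all(abs(d1[i] * d2[j] - d1[j] * d2[i]) <= tol
               for i in range(n) for j in range(i + 1, n))


# Two hypothetical transformations with nonzero determinant.
FRAME_1 = ([[1.0, 0.0, 0.0, -0.5],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-0.2, 0.0, 0.0, 1.0]], [1.0, 2.0, 3.0, 4.0])
FRAME_2 = ([[0.0, -1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.3],
            [0.0, 0.0, 2.0, 0.0],
            [0.0, 0.0, 0.0, 1.5]], [0.0, -1.0, 0.5, 2.0])

if __name__ == "__main__":
    events = [worldline([0.3, -0.1, 0.2], [1.0, 0.0, -2.0], t)
              for t in (0.0, 1.0, 3.0)]
    combined = compose(FRAME_2, FRAME_1)
    mapped = [apply(combined, e) for e in events]
    print(are_collinear(*mapped))
    print(apply(FRAME_2, apply(FRAME_1, events[1])), mapped[1])
```

Closure under composition is only one of the group properties; inverses exist because the determinant in (3) is required to be nonzero.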
If we further want to derive the constancy of the velocity of light, we
have to define clock, scale or the metrical structure of space and time.
By suitable stipulation of these concepts, we shall obtain the Lorentz
transformations.
Before proceeding further, I want to come back to the problem of how
to define uniform motion. The linear form we adopt was of course in
direct analogy with euclidean space. Of course, there is no a priori reason
for euclidean space. However, that the euclidean space is plausible may
be seen as follows. In order to discuss the structure of space and time, we
will have to introduce the metric of the space. Let us assume that the
metric dl of (x, y, z) space is given by
$dl^2 = g_{ij}\,dx^i dx^j$, $(i, j = 1, 2, 3)$, $(x^1, x^2, x^3) = (x, y, z)$.
It will be quite natural to assume that the distance dl, which a point in
uniform motion travels in time dt, is proportional to dt. If we assume this,
then $g_{ij}$ must be constant. From this we can easily prove the euclidean
property of the space. Then, we can introduce a cartesian coordinate
system, and can define scale. Clock can be defined by combining scale
and uniform motion.
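The step from proportionality of dl and dt to constancy of the metric coefficients can be spelled out as follows. This is a sketch under an added assumption that the text does not state explicitly: the proportionality factor $k$ is taken to depend only on the velocity components $v^i$ of the motion (2), not on the constants $c^i$, that is, not on where in space the motion takes place.

```latex
\begin{align*}
x^i(t) = v^i t + c^i
  &\;\Longrightarrow\; dl^2 = g_{ij}(x)\,v^i v^j\,dt^2,\\
\frac{dl}{dt} = k(v)\ \text{for all } c^i
  &\;\Longrightarrow\; g_{ij}(x)\,v^i v^j = k(v)^2 \ \text{at every point } x,\\
  &\;\Longrightarrow\; \partial_k g_{ij}(x)\,v^i v^j = 0 \ \text{for all } v
  \;\Longrightarrow\; \partial_k g_{ij} = 0 .
\end{align*}
```

The last implication uses the symmetry of $g_{ij}$: a symmetric quadratic form that vanishes for every $v$ has vanishing coefficients.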
In pre-relativistic theories, it is postulated that the running rate of a
clock is the same for all observers, independently of their state of motion.
Namely, the existence of an absolute time lapsing objectively is assumed.
We do not make such an assumption, since there is no compelling reason
for this. The running rate of the moving clock can be determined by (3),
namely by its state of motion and the nature of scale which is determined
by the euclidean nature of the space.
The axiomatic formulation of the special principle of relativity has
been the main problem of the foregoing discussions. Our papers were
attempts aimed at this end. Of course, we did not aim at rigorous axi-
omatization of the theory. Our interest was not in logical exactness but
was rather in knowing how to express the content of the special principle of
relativity. We believe that any attempt to axiomatize special relativity
should start from analyzing the content of the special principle of rela-
tivity in all possible ways.
Our work reveals that uniform motion, normal frame and Minkowski
space-time are cyclically related and that logically there is no reason to
give priority to one of them. Therefore, either to assume the existence
of objects which undergo uniform motion first, or to assume Minkowski
space-time first, will be a kind of tautology.
If we want simplicity and rigor in the axiomatization of special rela-
tivity, then the existence of Minkowski space-time will have to be postu-
lated first. Or to postulate the constancy of light velocity first instead
of doing it last may be a simpler way than to specify the nature of space-
time first. Whichever way we choose, there remain a number of problems
to be considered in the axiomatization of special relativity. Our work will
serve to solve one of these problems; however, its weakest point lies in
not drawing any conclusion about how to specify the space-time structure. On the
other hand, because of this deficiency we are left with the freedom of
choosing a space-time structure. This is the next problem to be studied.
Bibliography
[1] GÖDEL, K., An example of a new type of cosmological solutions of Einstein's field
equations of gravitation. Reviews of Modern Physics, vol. 21 (1949), pp. 447-450.
-----, A remark about the relationship between relativity and idealistic philosophy,
in SCHILPP, P. A. (ed.) Albert Einstein: Philosopher-Scientist. New York 1951,
pp. 555-562.
GRÜNBAUM, A., Das Zeitproblem. Archiv für Philosophie, vol. 7 (1957), pp.
165-208.
[2] NARIAI, H., On a new cosmological solution of Einstein's field equations of
gravitation. The Science Reports of the Tôhoku University, Ser. I, vol. XXXV
(1951), pp. 62-67.
[3] UENO, Y. and H. TAKENO, On equivalent observers. Progress of Theoretical
Physics, vol. 8 (1952), pp. 291-301.
[4] -----, On the equivalency for observers in the special theory of relativity. Progress
of Theoretical Physics, vol. 9 (1953), pp. 74-84.
[5] BONDI, H., Cosmology. Cambridge 1952, 146 pp.
[6] EINSTEIN, A., The meaning of relativity. Princeton 1953, 25 pp.
Symposium on the Axiomatic Method
ON THE FOUNDATIONS OF QUANTUM MECHANICS 1
HERMAN RUBIN
University of Oregon, Eugene, Oregon, U.S.A.
1. We shall consider several formulations of the foundations of quantum
mechanics, and some of the mathematical problems arising from
them. Various of these problems will be treated in greater or less detail.
Most of the results presented here are not new, and it is the purpose of
this paper mainly to bring to the attention of workers in this field
some of the difficulties which they have blithely overlooked. Most of the
mathematicians dealing with the foundations of quantum mechanics have
concerned themselves mainly with Hilbert space problems ; one point they
have brought out is the distinction between pure and mixed states. We
shall not concern ourselves here with this problem, but shall confine our
attention to pure states.
We give three formulations in detail ; A, the Hilbert space formulation
with unitary transition operators, B, the matrix-transition-probability-
amplitude formulation, and C, the phase-space formulation. Each of these
formulations is adequate for quantum mechanics. In formulation A in the
classical case, the problem is usually specified by giving the
Hamiltonian and then solved by means of the Schrödinger equation.
Feynman has proposed a method of path integrals; these are not, as
claimed, averages over a stochastic process, and while a similarity to
stochastic processes exists and should be exploited, it does not mean that
theorems and methods applicable to stochastic processes automatically
apply. The same remarks apply to approach B, and a table is included of
some important differences between stochastic and quantum processes.
The identifiability problem is also pointed out for formulation B.
Formulation C is formally much closer to stochastic processes than A
or B, but important differences are apparent. First and most important,
the joint "density" of position and momentum need not be non-negative
or even integrable. This, it seems to the author, implies that not only are
position and momentum not simultaneously precisely measurable, but
1 Research partially supported by an OOR contract. Reproduction in whole or in
part is permitted for any purpose of the United States government.
that they are not even simultaneously measurable at all. It is true that
non-negativeness of the density is preserved, but even here the motion is
not that of a stochastic process.
2. Let $\mathcal{H}$ be a Hilbert space, $\mathcal{S}$ a partially ordered set, which in
the relativistic case could be thought of as the set of all space-like
surfaces, and in the classical case all points of time. Suitable conditions
which will not be discussed here are to be imposed on $\mathcal{S}$.
A. For all $S, T \in \mathcal{S}$, $S < T$, there is a unitary operator $U_{TS}$ on $\mathcal{H}$
such that if $R < S < T$,
(1) $U_{TR} = U_{TS}U_{SR}$.
In the classical case
(2) $U_{TS} = \exp[-i(T - S)H/\hbar]$,
where H is the Hamiltonian, and the Hilbert space may be taken to be $L_2$
over a Euclidean space of suitable dimensionality.
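Formulation A can be made concrete on the smallest possible example. The sketch below is an illustration only and rests on assumptions that are not in the text: a two-dimensional Hilbert space, units with $\hbar = 1$, and the hypothetical Hamiltonian $H = \omega\sigma_x$, for which the exponential $\exp[-i(T - S)H]$ has the closed form $\cos\theta\,I - i\sin\theta\,\sigma_x$ with $\theta = \omega(T - S)$. It checks the composition law (1) and the unitarity of the transition operators.

```python
import math


def transition(t_final, t_initial, omega=1.0):
    """U_TS = exp(-i (T - S) H) for H = omega * sigma_x, with hbar = 1."""
    theta = omega * (t_final - t_initial)
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]


def mat_mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def max_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


IDENTITY = [[1.0 + 0.0j, 0.0j], [0.0j, 1.0 + 0.0j]]

if __name__ == "__main__":
    r, s, t = 0.2, 0.9, 2.4   # hypothetical "times" with R < S < T
    u_tr = transition(t, r)
    u_ts_sr = mat_mul(transition(t, s), transition(s, r))
    print(max_diff(u_tr, u_ts_sr))                                  # composition law (1)
    print(max_diff(mat_mul(u_tr, dagger(u_tr)), IDENTITY))          # unitarity
```

Both printed numbers are at the level of rounding error.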
A central problem in quantum mechanics is specification of the Hilbert
space and unitary operators involved.
Let E and F be complete spectral decompositions of the identity.
Since for all $x \in \mathcal{H}$, $x = \int dE\,x = \int dF\,x$, we have $U_{TS}x = \iint dF\,U_{TS}\,dE\,x$,
integrated first over E. But this is just the formulation of matrix mechanics.
Thus if suitable regularity conditions are satisfied,
B. For all $R, S, T \in \mathcal{S}$, $R < S < T$, D, E, F complete spectral
decompositions of the identity,
(3) $dF_T = \int d\lambda_{TS}(F, E)\,dE_S$
and
(4) $d\lambda_{TR}(F, D) = \int d\lambda_{TS}(F, E)\,d\lambda_{SR}(E, D)$.
One can reconstruct U from $\lambda$.
If the spectral decompositions are discrete, the integration becomes a
summation. Also, we have the following interpretation of $\lambda$: the probability
that an observation at "time" T will yield a result in a set $\mathcal{X}$ given
that an observation at "time" S yields a result E is
(5)
This has been interpreted as analogous to a stochastic process. However,
the differences are quite apparent to one familiar with stochastic pro-
cesses, and are important. For a stochastic process, the analogues of (3)
and (4) are customarily taken as definitions. However, expression (5) is
replaced by
(6) $\int_{\mathcal{X}} \lambda_{TS}(F, E)\,dF$.
The analogue of approach A is not as immediate. $\mathcal{H}$ is to be replaced
by an $L_1$ space over a finite measure space, which can be abstractly
characterized. Then $U_{TS}$ becomes a positive linear operator on $\mathcal{H}$ to $\mathcal{H}$
and (1) is satisfied. In addition, for some strictly positive function $\mu$,
and all S and T, $U_{TS}\mu = \mu$. Also we may frequently, but not always, in
the stationary classical case, write
(7) $U_{TS} = \exp[(T - S)V]$,
where V is called the infinitesimal generator of the semigroup U.
To see the differences clearly, let us consider the classical case where the
Hilbert space is $l_2$, i.e., all sequences of real numbers with finite sums of
squares. Complex Hilbert space seems natural in quantum mechanics,
but since every Hilbert space is automatically a real Hilbert space, and
the analogy is better, we could use the real case. However, the complex
case actually provides a closer analogy to a real stochastic process! If we
now take E = F to be the natural decomposition of $l_2$, we may make the
following analogy with a discrete-space stochastic process. Starred sections
refer only to stationary processes with linear "time".
Stochastic process                           Quantum mechanics

Markov matrix $U_{TS}$                       Unitary matrix $U_{TS}$
Transition probability $U_{TSij}$            Transition probability $|U_{TSij}|^2$
*Infinitesimal generator does not always     *Infinitesimal generator always exists
 exist and is not always unique.              and is unique.
*In the regular case, the infinitesimal      *Infinitesimal generator is a skew
 generator has all row sums 0, and all        Hermitian matrix.
 nondiagonal elements non-negative.
Ordering of $\mathcal{S}$ irreversible.      Ordering of $\mathcal{S}$ reversible.
*Trivial if periodic.                        *Can be non-trivial and periodic.
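One difference in the table can be seen in a two-state example. The sketch below is illustrative and uses assumptions not in the text: a two-dimensional real space in place of $l_2$, and a plane rotation as the unitary (here orthogonal) matrix. It shows that the matrix of transition probabilities $|U_{ij}|^2$ over a whole interval differs from the product of the corresponding matrices over two subintervals, so the probabilities do not compose as those of a Markov chain do, and that the rotation is periodic without being trivial.

```python
import math


def rotation(theta):
    """Real unitary (orthogonal) 2x2 matrix, playing the role of U_TS."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def squared_moduli(u):
    """Matrix of transition probabilities |U_ij|^2."""
    return [[abs(entry) ** 2 for entry in row] for row in u]


def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    step = math.pi / 4
    p_direct = squared_moduli(rotation(2 * step))
    p_chained = mat_mul(squared_moduli(rotation(step)),
                        squared_moduli(rotation(step)))
    print(p_direct)    # close to [[0, 1], [1, 0]]
    print(p_chained)   # close to [[0.5, 0.5], [0.5, 0.5]]
    print(rotation(2 * math.pi))   # close to the identity: periodic, not trivial
```

For a Markov matrix the two printed probability matrices would coincide; here amplitudes compose, not probabilities.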
From A, if the Hilbert space is explicitly an $L_2$ space, it may be
possible to write for a dense set of functions
(8) $(U_{TS}x)(u) = \int K_{TS}(u, v)\,x(v)\,dv$,
where $K_{TS}$ is a unitary kernel. It may be possible, and indeed in the
classical case it is, to determine the T-derivative of K at T = S. Suppose
$K_{TS}^*$ is a unitary approximation to $K_{TS}$, such that the T-derivatives of
K and K* coincide at T = S. In the classical case, Feynman did this by
writing
(9) $K_{TS}^*(u, v) = N(T - S)\exp\left[\frac{i}{\hbar}A_{TS}(u, v)\right]$,
where $A_{TS}(u, v)$ is the action along the classical path from v at "time" S
to u at "time" T. Then we may define $U_{TS}^*$ from $K_{TS}^*$ in a manner
analogous to (8). It may be that
$U_{TS} = \lim \prod_{i=1}^{n} U_{T_i T_{i-1}}^*$, $T_0 = S$, $T_n = T$, $T_{i-1} < T_i$,
when the partition becomes fine. Although there are several treatments
in the literature, including some by prominent mathematicians, the
existence and value of this limit has not been proved. From the Schrödinger
equation, one can prove the following
THEOREM: If there exists a basis of $L_2$ such that for each function x in
the basis, the second derivatives of $U_{TS}x$ have a uniformly integrable Fourier
transform, then $U_{TS} = \lim \prod_{i=1}^{n} U_{T_i T_{i-1}}^*$, where $T_0 = S$, $T_n = T$, $T_{i-1} < T_i$,
and the partition becomes fine.
It seems likely that this result can be considerably extended.
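The idea of recovering $U_{TS}$ as a limit of products of short-time unitary approximations can be tried out in a finite-dimensional setting, where no analytic difficulty arises. The sketch below does not use Feynman's kernel (9); as a stand-in it uses the Cayley form $(1 + iH\delta/2)^{-1}(1 - iH\delta/2)$, which is unitary and has the same derivative $-iH$ at $\delta = 0$ as the exact operator. The 2x2 Hamiltonian, the units $\hbar = 1$, and all function names are hypothetical.

```python
import math

H = [[1.0, 0.5], [0.5, -1.0]]   # hypothetical traceless Hamiltonian, hbar = 1
R = math.sqrt(1.0 + 0.25)       # H*H equals R*R times the identity


def mat_mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Inverse of a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def exact(tau):
    """exp(-i tau H), in closed form because H*H is a multiple of the identity."""
    c, s = math.cos(R * tau), math.sin(R * tau)
    return [[(c if i == j else 0.0) - 1j * s * H[i][j] / R for j in range(2)]
            for i in range(2)]


def short_time(delta):
    """Unitary approximation over a short step, with derivative -iH at delta = 0."""
    plus = [[(1.0 if i == j else 0.0) + 0.5j * delta * H[i][j] for j in range(2)]
            for i in range(2)]
    minus = [[(1.0 if i == j else 0.0) - 0.5j * delta * H[i][j] for j in range(2)]
             for i in range(2)]
    return mat_mul(inverse(plus), minus)


def product(tau, n):
    """Product of n short-time approximations over a uniform partition of [0, tau]."""
    step = short_time(tau / n)
    result = [[1.0 + 0.0j, 0.0j], [0.0j, 1.0 + 0.0j]]
    for _ in range(n):
        result = mat_mul(step, result)
    return result


def error(tau, n):
    """Largest entrywise deviation of the product from the exact operator."""
    approx, target = product(tau, n), exact(tau)
    return max(abs(approx[i][j] - target[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, error(1.0, n))
```

The error shrinks as the partition is refined. In finite dimensions this convergence is elementary; the difficulty discussed in the text concerns the infinite-dimensional kernel (9), which this sketch does not address.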
If we examine the analytic form of (9) , we find that it resembles that of
a diffusion process. However, the "variance" of the "diffusion process"
would have to be purely imaginary. Furthermore, there are even periodic
models in quantum mechanics which satisfy the theorem above. If T—S
is a multiple of the period, KTS cannot be a function in the ordinary
sense. In fact, if T—S is a multiple of any discrete spectral value, this
difficulty arises.
Another difficulty with this formulation is the statement that in the
limit KTS is the normalized mean value on x of exp(A(u, v, x)) where x is
a path with end points v at S and u at T. In the case of a diffusion process,
it is well known that the corresponding exponent is infinite with proba-
bility one. The same difficulty has already been noted in the quantum-
mechanical formulation.
The Feynman expression is also rather difficult
to evaluate. However, stochastic process methods may be useful. While
the process has purely imaginary variance, we may compute the diffusion
process with real variance and use analytic continuation. Again, it re-
mains to be proved that this method is correct. An intermediate approach
would be to apply analytic continuation to the coefficient of the kinetic
energy term alone. This last method has worked for the free particle
and the harmonic oscillator, and methods for computing the results in
general have been given by Kac.
One merit of the Feynman approach is that it has great possibility of
generalization in that it leads to a specific result for UTS, the specification
of which is a main problem of quantum mechanics and usually over-
looked by mathematicians dealing with the subject.
There is an outstanding question which arises from the empirical
standpoint; namely, if the model is correct, how much of the model can
be determined by even an infinite number of observations? This seems to
be most clearly brought out in formulation B above. For simplicity, let
us assume that the decompositions E and F are discrete. Then the
observable quantities are $|\lambda_{TSij}|^2$. Clearly these are not always adequate
for fixed E and F even if S and T are arbitrary.
In the discrete case, $\lambda_{TSij} = (U_{TS}f_i, e_j)$. If we may vary E arbitrarily,
we may determine $U_{TS}f_i$ completely apart from a constant of absolute
value 1 for each i. If furthermore E = F and for almost all S, T, $U_{TSij} \neq 0$
for all i and j, we can determine $U_{TSij}$ apart from a constant of absolute
value 1 independent of i and j, i.e., apart from a gauge transformation.
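The identifiability problem for fixed decompositions can be made concrete. The sketch below is illustrative; the matrix, the phase angles, and the function names are hypothetical. It multiplies the rows and columns of a 2x2 unitary matrix by phase factors of absolute value 1. The result is again unitary and generally a different operator, yet every observable quantity $|U_{ij}|^2$ is unchanged.

```python
import cmath
import math


def rephase(u, row_angles, col_angles):
    """Multiply row i by exp(i*row_angles[i]) and column j by exp(i*col_angles[j])."""
    n = len(u)
    return [[cmath.exp(1j * row_angles[i]) * u[i][j] * cmath.exp(1j * col_angles[j])
             for j in range(n)] for i in range(n)]


def observables(u):
    """The observable quantities |U_ij|^2 for fixed decompositions."""
    return [[abs(entry) ** 2 for entry in row] for row in u]


THETA = 0.6   # hypothetical rotation angle
U = [[complex(math.cos(THETA)), complex(-math.sin(THETA))],
     [complex(math.sin(THETA)), complex(math.cos(THETA))]]
V = rephase(U, [0.3, 1.1], [-0.7, 0.2])   # hypothetical phase angles

if __name__ == "__main__":
    print(observables(U))
    print(observables(V))        # identical to the line above
    print(U[0][0], V[0][0])      # the matrices themselves differ
```

Observations with one fixed pair of decompositions therefore cannot distinguish `U` from `V`; further decompositions are needed, as the text goes on to describe.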
Another approach is the statistical approach of Moyal. This approach,
originally due to Wigner, is to investigate the joint "distribution" of
position and momentum. First, suppose a finite number $A_1, \ldots, A_n$ of
Hermitian operators are given. Then if they have a joint distribution, its
characteristic function is $\mathcal{E}(\exp \sum i t_j A_j)$. However, the operator inside
the expectation is a unitary operator, and consequently the expectation
in question exists.
Therefore we should be able to determine the distribution from the
expectation. For example, let $A_1$, $A_2$ and $A_3$ be the spin operators for
an electron in a hydrogen atom about which nothing has been deduced by
experimentation about the spin. Then $\mathcal{E}(\exp \sum i t_j A_j) = \cos\left(\tfrac{\hbar}{2}\sqrt{t_1^2 + t_2^2 + t_3^2}\right)$,
which is certainly not the characteristic function of any distribution. Let
us proceed as if this difficulty does not arise, and let us treat the case of
position and momentum. We obtain the characteristic function
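(10) $\mathcal{E}(\exp(i\alpha p + i\beta q)) = \int \psi^*(q - \tfrac{1}{2}\hbar\alpha)\,e^{i\beta q}\,\psi(q + \tfrac{1}{2}\hbar\alpha)\,dq$
and the corresponding density
(11) $f(p, q) = \frac{1}{2\pi}\int \psi^*(q - \tfrac{1}{2}\hbar\alpha)\,e^{-i\alpha p}\,\psi(q + \tfrac{1}{2}\hbar\alpha)\,d\alpha$.
Another example of the misbehavior of f is in order. Let us consider a
plane wave passing through a slit of aperture 2a. Then $\psi(x) = 1/\sqrt{2a}$,
$-a < x < a$, and we obtain
(12) $f(p, q) = \frac{1}{2\pi a p}\sin\frac{2(a - |q|)p}{\hbar}$ for $|q| < a$, and $f(p, q) = 0$ for $|q| > a$.
We clearly see that f is not non-negative, and not even Lebesgue
integrable.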
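Both defects of the slit density can be observed numerically. The sketch below is an illustration under stated assumptions: it takes the density in the form $f(p, q) = \sin(2(a - |q|)p/\hbar)/(2\pi a p)$ for $|q| < a$, uses the hypothetical values $a = 1$ and $\hbar = 1$, and estimates integrals of $|f|$ over p by the midpoint rule. The density is negative at some points, and the integral of $|f|$ over successive decades of p does not decay, which is the signature of a logarithmically divergent integral.

```python
import math


def slit_density(p, q, a=1.0, hbar=1.0):
    """Phase-space density for a uniform wave function on (-a, a)."""
    if abs(q) >= a:
        return 0.0
    width = 2.0 * (a - abs(q)) / hbar
    if p == 0.0:
        return width / (2.0 * math.pi * a)   # limit of sin(width*p)/p as p -> 0
    return math.sin(width * p) / (2.0 * math.pi * a * p)


def abs_integral(p_low, p_high, q=0.0, a=1.0, hbar=1.0, step=0.01):
    """Midpoint-rule estimate of the integral of |f(p, q)| over [p_low, p_high]."""
    n = int(round((p_high - p_low) / step))
    total = sum(abs(slit_density(p_low + (k + 0.5) * step, q, a, hbar))
                for k in range(n))
    return total * step


if __name__ == "__main__":
    print(slit_density(2.0, 0.0))            # a negative value of the density
    for low, high in ((1.0, 10.0), (10.0, 100.0), (100.0, 1000.0)):
        print(low, high, abs_integral(low, high))
```

Each decade contributes a comparable amount to the integral of $|f|$, so the total grows without bound as the range of p is extended.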
It would be desirable to have an abstract characterization of all
permissible "densities", as the density is adequate both for the kinematics
and for the dynamics of quantum mechanics. Let us proceed to do so. As
to the kinematics, it follows from (11) that for almost all x, y,
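(13) $\psi(x)\psi^*(y) = \int f\left(p, \frac{x + y}{2}\right) e^{ip(x - y)/\hbar}\,dp$.
Therefore
(14) $\iint f\left(p, \frac{x + y}{2}\right) f\left(\pi, \frac{z + w}{2}\right) e^{i[p(x - y) + \pi(z - w)]/\hbar}\,dp\,d\pi$
$= \iint f\left(p, \frac{x + w}{2}\right) f\left(\pi, \frac{z + y}{2}\right) e^{i[p(x - w) + \pi(z - y)]/\hbar}\,dp\,d\pi$
for almost all x, y, z, w. If, in addition, $\int f(p, x)\,dp$ is a probability density,
there will be a unique solution for $\psi$ apart from a factor of absolute
value 1. Conversely, if f satisfies (14) and $\int f(p, x)\,dp$ is a probability
density, the $\psi$ deduced from f by (13) yields f in return.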
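Concerning the dynamics of the process, Moyal has shown that the
temporal derivative of the characteristic function (10) is, where H is the
classical Hamiltonian,
(15) $\frac{\partial}{\partial t}\mathcal{E}(\exp(i\alpha p + i\beta q)) = \frac{i}{\hbar}\iint \left[H(p + \tfrac{1}{2}\hbar\beta, q - \tfrac{1}{2}\hbar\alpha) - H(p - \tfrac{1}{2}\hbar\beta, q + \tfrac{1}{2}\hbar\alpha)\right] e^{i(\alpha p + \beta q)} f(p, q, t)\,dp\,dq$.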
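Inverting this, we obtain for the derivative of the density
(16) $\frac{\partial f(p, q, t)}{\partial t} = -\frac{2}{\pi^2\hbar^3}\iint \Im\left[e^{2i(p'q - q'p)/\hbar}\,\hat{H}\left(-\tfrac{2}{\hbar}(q - q'), \tfrac{2}{\hbar}(p - p')\right)\right] f(p', q', t)\,dp'\,dq'$,
where $\Im(u + iv) = v$, and $\hat{H}(\alpha, \beta) = \iint H(p, q)\,e^{i(\alpha p + \beta q)}\,dp\,dq$ denotes the Fourier transform of H. A more
convenient form of (16) is
(17) $\frac{\partial f(p, q, t)}{\partial t} = -\frac{1}{2\pi^2\hbar}\iint \Im\left[e^{-i(\alpha p + \beta q)}\,\hat{H}(\alpha, \beta)\right] f(p - \tfrac{1}{2}\hbar\beta, q + \tfrac{1}{2}\hbar\alpha, t)\,d\alpha\,d\beta$.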
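Even this form gives some difficulties in evaluation because of the
non-existence in the usual sense of $\hat{H}$, and the right-hand side of (16) has to
be evaluated by approximations. The form which Moyal seems to prefer
is even worse in this respect, but it also has some advantages.
(18) $\frac{\partial f(p, q, t)}{\partial t} = \frac{2}{\hbar}\sin\left\{\frac{\hbar}{2}\left[\frac{\partial}{\partial q_H}\frac{\partial}{\partial p_f} - \frac{\partial}{\partial p_H}\frac{\partial}{\partial q_f}\right]\right\} H(p, q)\,f(p, q, t)$,
where the subscripts H and f indicate the function on which each derivative acts.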
[This latter form shows more clearly the relationship between classical
and quantum mechanics, but the differential operator on the right is of
infinite order and analytic difficulties may clearly ensue. In the case in
which H is a polynomial of degree at most 2, (18) reduces to the classical
equations of motion; quantum-mechanical considerations come in only
through restrictions (14) on f.]
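The reduction to the classical equations of motion can be made explicit by expanding the sine. The following is a worked sketch, not part of the original text; it assumes the sine-bracket form of the evolution equation with the operator $\Lambda$ defined below, where the subscripts H and f indicate the function on which each derivative acts.

```latex
\begin{align*}
\frac{2}{\hbar}\sin\!\left(\frac{\hbar}{2}\Lambda\right)
  &= \Lambda - \frac{\hbar^2}{24}\Lambda^3 + \cdots,
  \qquad
  \Lambda = \frac{\partial}{\partial q_H}\frac{\partial}{\partial p_f}
          - \frac{\partial}{\partial p_H}\frac{\partial}{\partial q_f},\\
\Lambda^3 H f &= 0
  \quad\text{when } H \text{ is a polynomial of degree at most } 2,\\
\frac{\partial f}{\partial t} &= \Lambda H f
  = \frac{\partial H}{\partial q}\frac{\partial f}{\partial p}
  - \frac{\partial H}{\partial p}\frac{\partial f}{\partial q}.
\end{align*}
```

Every term of $\Lambda^3$ carries three derivatives of H, which is why all corrections in powers of $\hbar$ vanish for a quadratic Hamiltonian and only the classical Liouville equation remains.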
In any case, it follows that the dynamics of the phase-space repre-
sentation above does not further involve the wave function. Consequently,
the dynamics of $\psi$ is determined up to a gauge transformation by equation
(17), and hence the following formulation is adequate for classical
one-dimensional quantum mechanics :
C. There is a function f of three arguments satisfying almost everywhere
for some value t of its third argument, (14) and $\int f(p, x, t)\,dp$ is a probability
density, and satisfying (17).
It is clear how to extend this to higher dimensional cases.
This "probabilistic" procedure might also be used to construct the
unitary kernel $K_{TS}$ for the Feynman approach, although this has not
been done.
Bibliography
[1] FEYNMAN, R. P., Space-time approach to nonrelativistic quantum mechanics.
Review of Modern Physics, vol. 20 (1948), p. 367.
[2] GELFAND, I. M. and A. M. YAGLOM. Integration in function spaces and its
application to quantum physics. Uspekhi Matematicheskikh Nauk (N.S.), vol.
11 (1956), p. 77.
[3] KAC, M., On some connections between probability theory and differential and
integral equations. Proceedings of the Second Berkeley Symposium on Mathe-
matical Statistics and Probability, University of California, Berkeley 1951.
[4] MONTROLL, E. W., Markoff chains, Wiener integrals, and quantum theory.
Communications on Pure and Applied Mathematics, vol. 5 (1952), p. 415.
[5] MORETTE, C., On the definition and approximation of Feynman's path integrals.
Physical Review, vol. 81 (1951), p. 848.
[6] MOYAL, J. E., Quantum mechanics as a statistical theory. Proceedings of the
Cambridge Philosophical Society, vol. 45 (1949), p. 99.
[7] SEGAL, I. E., Postulates for general quantum mechanics. Annals of Mathematics
(2), vol. 48 (1947), p. 930.
[8] STONE, M. H., Notes on integration I, II, III, IV. Proceedings of the National
Academy of Sciences, U.S.A., vol. 34 (1948), p. 336, p. 447, p. 483, vol. 35
(1949), p. 50.
Symposium on the Axiomatic Method
THE MATHEMATICAL MEANING OF OPERATIONALISM
IN QUANTUM MECHANICS
I. E. SEGAL
University of Chicago, Chicago, Illinois, U.S.A.
1. Introduction. An operational treatment may be described as one that
deals exclusively with observables; but the latter term is physically as
well as mathematically somewhat ambiguous. Our aim here is to circum-
scribe this ambiguity by axioms for the observables that will be satis-
factory as far as they go, but by no means categorical. On the other hand,
it will turn out that it is not too far from such axioms to plans for a
categorical model representing the field of all elementary particles.
The need to consider so broad a system arises in several ways. For one
thing, no axiom system is secure if it does not treat a closed system, and
except substantially in the case of classical quantum mechanics (by which
we mean the non-relativistic quantum mechanics of a finite number of
degrees of freedom), there is no mathematical or physical assurance that
the systems conventionally considered are really closed. In fact the
evidence, — highly inconclusive as it may be, — points very much in the
other direction. For another, although the mathematical foundations of
classical quantum mechanics are in a relatively satisfactory state from at
least a technical point of view (the theory is consistent, within obvious
limits categorical, and realistic), time and energy play crucial but puzzling
roles, as observables unlike the others. While this remains true in rela-
tivistic quantum field theory, for different reasons, it seems fair to say
that one of the accepted informal axioms of the theory is that it must
ultimately contain the solution to the puzzle, if such exists.
We should not gloss over the question of just what is a quantum field
theory, — in fact, this is the main question we wish to examine here. It is
a difficult question, since at present what we have, after thirty years of
intensive effort, is a collection of partially heuristic technical develop-
ments in search of a theory; but it is a natural one to examine axiomati-
cally. Present practice is largely implicitly axiomatic, and nothing
resembling a mathematically viable explicit constructive approach has
yet been developed. In any event a constructive approach must pre-
sumably describe the physical particles with which an operational theory
must deal in terms of the only remotely operational bare particles, a
problem that is relatively involved in the current non-rigorous treatments,
and needs to be clarified by a suitable axiomatic formulation.
Description of a field, whether classical or quantum, involves analyti-
cally three elements: (a) its phenomenology, i.e. the statement of what
mathematically are the observables of the field, and what are their
physical interpretations, — including especially, in the case of quantum
fields, the statistics, i.e. the observables called single-particle occupation
numbers, which do not exist in classical fields, and form the basis for the
particle interpretation of quantum fields; (b) its kinematics, i.e. the
transformation properties of the field observables under the fundamental
symmetry group of the system; (c) its dynamics, or the 'temporal' de-
velopment of the field, where however the 'dynamical time' involved
must be distinguished from the 'kinematical time' involved in (b). The
dynamics results from the interaction between the particles constituting
the field, and is in fact its only observable manifestation, while the ki-
nematics has nothing to do with this interaction.
The present state of the axiomatics of these elements and of the
desiderata relevant for further developments is discussed from a jointly
mathematical and operational viewpoint in the following.
2. Phenomenology. This is the best-developed of the relevant phases
of quantum mechanics from both a mathematical and an operational
point of view. One knows that the bounded observables, which are the
only ones that can in principle be measured directly, form a variety of
algebra, of which the self-adjoint elements of a uniformly closed self-
adjoint algebra of operators on a Hilbert space (C*-algebra) is virtually
the exclusive practical prototype. One knows also that the states of the
system are represented by normalized positive linear functionals on the
algebra, the value of such a functional on an element being what is
conventionally called the 'expectation value of the observable in the
state' in physics, but there being no operational distinction between the
state and the associated functional, — i.e. operationally (and in our usage
in the following) a state is precisely such a functional. In these terms the
essential notions of pure state, spectral value of an observable, probability
distribution of an observable in a state, etc., can be axiomatized and shown
to admit a mathematical development adequate for physical needs.
An important conclusion of the theory is that a physical system is
completely specified operationally by giving the abstract algebra formed
by the bounded observables of the system, i.e. the rules for forming linear
combinations of and squaring observables. In particular, operationally
isomorphic algebras of observables that are represented by concrete C*-
algebras on Hilbert spaces, do not at all need to be unitarily equivalent,
even when, for example, they are both irreducible. The irrelevant and
impractical requirement of unitary equivalence is in fact the origin of
serious difficulties in the development of quantum field theory, a point
with which we shall deal more explicitly later.
The subsumption of quantum fields under general phenomenology
involves the formulation and treatment of the 'canonical field variables'
and the 'occupation numbers'. Traditionally the former were an ordered
set of symbols p_1, p_2, ... and q_1, q_2, ... satisfying the commutation
relations that had been so successful in classical quantum mechanics.
(This is for 'Bose-Einstein' fields; relevant also are 'Fermi-Dirac' fields,
but as these involve no great essential novelty as far as the present
aspects of axiomatics go, the present article treats only the Bose-Einstein
case.) It was assumed that these were an irreducible set of self-adjoint
operators, and that any two such systems were equivalent; upon this
informal axiomatic basis the theory rested. But from the very beginning
the success of quantum field theory was attended by 'infinities' in even
the simplest cases, and more recently it has been found that there exist
at least continuum many inequivalent irreducible systems of canonical
variables. Such troubles made it uncertain whether the phenomenological
structure described above was strictly applicable in the case of quantum
fields, or at least whether the canonical variables really were self-adjoint
operators in a Hilbert space. The proper sophistication, based on a
mixture of operational and mathematical considerations, gives however a
unique and transparent formulation within the framework of the phe-
nomenology described; the canonical variables are fundamentally
elements in an abstract algebra of observables, and it is only relative to a
particular state of this algebra that they become operators in Hilbert
space.
In a formal way it was easily seen that the symbolic operator
(p_k + i q_k)(p_k - i q_k) had integral proper values (i^2 = -1), and for this
and related reasons could be interpreted as 'the number of particles in the
field in the k-th state', which is essentially what puts the 'quantum' into
'quantum field theory', by giving it a particle interpretation. Those
particles, the 'quanta' of the field, have generally been presumed to be
'represented' by the vectors in a linear space, proportional vectors being
identified. This linear space L does not have direct operational sig-
nificance, since what is more-or-less directly observed are the 'occupation
numbers of single-particle states', i.e. the observables just defined
(formally). But the general principle that there exists (theoretically) a
single-particle space L, spanned by an infinite set of vectors e_1, e_2, ...,
and such that p_k + i q_k can represent in a certain sense the creation of a
particle with 'wave function' e_k, and the operator defined above the
total number of such particles in the field, has attained virtually as well-
established a position as the general phenomenological principles de-
scribed earlier. The great empirical success of relativistic quantum electro-
dynamics, in which the photon and the electron are represented by
suitably normalizable solutions of Maxwell's and Dirac's equation,
respectively, provides, among other developments, a basis for this princi-
ple, and indicates also that L should admit a distinguished positive-
definite Hermitian form, which determines, e.g., when two particles are
empirically similar. It is conservative as well as useful in treating certain
theories of recent origin to assume only a distinguished topological
structure that may be induced by such a form, which turns out to involve
no really significant weakening of the foundations, and ultimately to
clarify their logical structure. In fact, partly for logico-mathematical
reasons, and partly with a view to deriving ultimately the relevance of
complex scalars for the single-particle space from invariance under so-
called particle-anti-particle conjugation, it is appropriate to assume
initially that the single-particle structure is given by an ordered pair of
mutually dual, real-linear spaces with the topological structure described,
and with which the canonical p's and q's are respectively associated. A
distinguished admissible positive-definite inner product in one of these
spaces will give a distinguished complex Hilbert space structure on the
direct sum of the two spaces, but there are other ways in which this more
conventional structure may arise.
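The formal claim at the start of this passage, that (p_k + i q_k)(p_k - i q_k) has integral proper values, can be checked numerically for a single degree of freedom. The following is a minimal sketch added for illustration, not part of the original argument. It assumes the convention [q, p] = i (units with ħ = 1), under which the symbolic operator is formally p^2 + q^2 - 1, and it replaces the unbounded p and q by n x n truncations in the harmonic-oscillator basis. The function names are hypothetical. With this normalization the proper values are the even integers 0, 2, 4, ...; an overall factor 1/2 would give the usual occupation numbers 0, 1, 2, ....

```python
import numpy as np


def truncated_canonical_pair(n):
    """Return n x n truncations (p, q) of one canonical pair (formally [q, p] = i)."""
    # Annihilation matrix in the oscillator basis: a|k> = sqrt(k)|k-1>.
    a = np.diag(np.sqrt(np.arange(1, n)), k=1).astype(complex)
    a_dag = a.conj().T
    q = (a + a_dag) / np.sqrt(2)
    p = (a - a_dag) / (1j * np.sqrt(2))
    return p, q


def formal_number_operator(n):
    """Truncation of the symbolic operator (p + iq)(p - iq)."""
    p, q = truncated_canonical_pair(n)
    # Here p + iq is a multiple of the creation matrix and p - iq of the
    # annihilation matrix, so the product is twice the diagonal number matrix.
    return (p + 1j * q) @ (p - 1j * q)


def proper_values(n):
    """Sorted proper values (eigenvalues) of the truncated operator."""
    return np.sort(np.linalg.eigvalsh(formal_number_operator(n)))


if __name__ == "__main__":
    print(proper_values(6))
```

In this truncation p + iq is proportional to the creation matrix, in line with the statement above that p_k + i q_k represents the creation of a particle.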
Taking then a conservative position, and defining a phenomenological
single-particle structure as an ordered pair of real-linear spaces (H, H')
that are mutually dual in the sense that there is given a distinguished
non-singular bilinear form x.y' (x ∈ H, y' ∈ H'), a quantum field relative
to this structure may be rigorously, but provisionally, described as an
ordered pair of maps (p(.), q(.)) from H and H' respectively to the self-
adjoint operators on a complex Hilbert space K, satisfying the 'Weyl
relations' :
    e^{ip(x)} e^{ip(y)} = e^{ip(x+y)},    e^{iq(x')} e^{iq(y')} = e^{iq(x'+y')},
    e^{ip(x)} e^{iq(y')} = e^{i x.y'} e^{iq(y')} e^{ip(x)},
which are formally equivalent to the conventional commutation re-
lations, but mathematically more viable, in that difficulties associated
with unbounded operators such as the p's and q's themselves, are avoided.
This is merely an honest, if slightly sophisticated and general, mathe-
matical transcription from the ideas and practice of physical field theory,
but it is useful in providing a basis for deciding what is literally true
about quantum fields, and what is figurative or symbolic. Thus the
physical folk-theorem : 'Any two irreducible quantum fields are connected
by a unitary transformation' is literally false, although it has figurative
validity, which on the basis of a further mathematical development can
be made rigorously explicit. The needs of field dynamics lead to this
development and to a revision of the present provisional notion of quan-
tum field which will be indicated later.
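The formal equivalence between the Weyl relations and the conventional commutation relations, asserted in this passage, can be made explicit by a heuristic computation. The sketch below is added for illustration. It assumes that the commutator of p(x) and q(y') is a multiple of the identity, so that the Baker-Campbell-Hausdorff identity e^A e^B = e^{[A,B]} e^B e^A applies, and it ignores all domain questions for unbounded operators.

```latex
% Assumed formal commutation relation (units with \hbar = 1):
\[
  [\,p(x),\, q(y')\,] = -\,i\,(x \cdot y')\,\mathbf{1}.
\]
% With A = i p(x) and B = i q(y'), the commutator [A, B] is central:
\[
  [A, B] = -[\,p(x),\, q(y')\,] = i\,(x \cdot y')\,\mathbf{1},
\]
\[
  e^{ip(x)}\, e^{iq(y')} = e^{[A,B]}\, e^{iq(y')}\, e^{ip(x)}
                         = e^{i\,x \cdot y'}\, e^{iq(y')}\, e^{ip(x)} .
\]
% Conversely, replace x, y' by sx, ty' in the third Weyl relation and
% differentiate once in s and once in t at s = t = 0:
\[
  -\,p(x)\,q(y') = i\,(x \cdot y')\,\mathbf{1} - q(y')\,p(x)
  \quad\Longrightarrow\quad
  [\,q(y'),\, p(x)\,] = i\,(x \cdot y')\,\mathbf{1}.
\]
```

For a single degree of freedom with x.y' = 1 this is the familiar relation [q, p] = i.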
Also in need of revision is the definition of occupation number of a
single-particle state. The validity of the occupation number interpretation
of the given operator depends in part on the representation of the total
field energy (etc.) in terms of occupation numbers of states of given
energy, in keeping with the idea that it should equal the sum of the pro-
ducts of the various possible single-particle energies with the numbers of
particles in the field having these energies. This holds for a certain
mathematically and physically distinguished quantum field in the fore-
going sense, studied by Fock and Cook, often called the 'free field',
although actually of dubious application to free incoming physical fields,
and almost certainly inapplicable to interacting fields. In any event, it
breaks down in the case of arbitrary fields, and there has been some un-
certainty as to whether a physically meaningful particle interpretation of
an arbitrary field could be given. The solution to this problem depends
on the proper integration of statistics with kinematics, to which we now
turn.
3. Kinematics. It is axiomatic that a suitable displacement of the
single-particle structure should effect a corresponding field displacement.
In the case of a classical field, given say by Maxwell's equations, it is
clear that an arbitrary Lorentz transformation L induces a transformation
U(L) in the space of solutions. From a quantum-field-theoretic point of
view however, U(L) is merely a displacement in the single-particle space
(of normalizable photon states), and what is needed is a transformation
V(L) on the field vector state space K of the preceding section. The
assumption that V(L) exists means essentially that any admissible change
of frame in ordinary physical space should give a corresponding transfor-
mation on the field states. In addition, the assumed independence of
transition probability rates of elementary particle processes from the
local frame of reference has led to the further assumption that V(L) is a
projective unitary representation of the Lorentz group, in at least the case
of the 'free incoming' physical field.
In addition to the Lorentz group, there is a group of transformations in
the single-particle vector state space which plays an important part in
nuclear physics, and which does not arise from transformations in ordinary
physical space, — namely, transformations in isotopic spin space. In the
absence of precise knowledge, it is assumed that this group acts indepen-
dently of the Lorentz group, but its precise structure as an abstract
group is undecided, and it is quite uncertain whether it is rigorously true
that these transformations commute with the action of the Lorentz group
on the single-particle space. There is also the group of gauge transfor-
mations, which is important in quantum electrodynamics, but does not
have any counterpart in most other elementary particle interactions.
The improper Lorentz transformations have recently been the subject
of intense interest. These transformations give rise to outer automorphisms
of the proper Lorentz group, and there seems to be at present no oper-
ational reason to doubt that this is their chief significance (rather than as
direct transformations in ordinary space-time), but the experimental
situation is far from giving any assurance that this is the case. In the case
of standard relativistic theory, this leaves only charge and particle-anti-
particle conjugation, of which the latter is connected with the equivalence
between particle and the contragredient anti-particle transformations,
and does not appear to represent in a natural way a group element.
Finally, these and other kinematical loose ends, together with the
dynamical divergences, have led certain scientists to investigate the
possibility that some other group may give more satisfactory results than
the Lorentz group, just as this group gave ultimately a sounder theory
than the Galilean group of Newtonian mechanics, and of which the Lo-
rentz group will be a type of degenerate form, just as the Galilean group is
a degenerate form of the Lorentz group.
On a conservative basis, it seems that about all that may legitimately
be assumed of a mathematically definite character is that there exists a
fundamental symmetry group G, which may reasonably be assumed to
be topological, and which acts linearly and continuously on the single-
particle vector state space. A priori it might appear that this is not suf-
ficient as a basis for an effective field kinematics, but it turns out that
special properties of G and of its action on the single-particle space are
not significant as regards the foundations of field kinematics. The main
desideratum is to establish the appropriate action of G on the field, and
this exists substantially in all cases, provided it is the operational action
that is considered. That is to say, the action of G on the state vectors of
the field, — which in the case of standard relativistic theory is given
formally in detail in the recent treatments of field theory in the literature,
— does not need to exist in a mathematical sense, any more than it exists
operationally; but the action of G on the field observables, which is
formally to transform them by its action on the state vectors, has effective
mathematical existence. However, to this end it is necessary to make the
revision of the notion of quantum field referred to above, to which one is
naturally led by dynamical and further operational considerations.
Before going into these matters, we mention that the generality of the
foregoing approach to kinematics permits the integration of the statistics
with the kinematics. Any non-singular continuous linear transformation
on the single-particle structure (H, H') preserving the fundamental skew
form x.y' - u.v' (x and u arbitrary in H, y' and v' arbitrary in H') acts
appropriately on the field observables; in particular certain phase
transformations in the single-particle space so act, and the occupation
numbers are obtained as generators of one-parameter groups of such field
actions. A development of this type is needed for the particle interpre-
tation of fields, if one is to avoid the ad hoc assumption that the free
incoming physical field is mathematically representable by the special
representation referred to earlier, as well as for dealing with the concept
of bound state.
4. Dynamics. In conventional theoretical physics, a dynamical transfor-
mation is represented by a unitary transformation mathematically. In
the case of an abstract algebra of observables as described above, it has
however no meaning to say that a transformation of this algebra is given
by a unitary transformation, for this may be true in certain concrete
representations of the algebra and not in others. It is clear though that
the transformation of the observables determined by a unitary operator in
a concrete representation is an automorphism of the algebra. Since
operationally an automorphism has all the relevant features of a dynamical
(or, for that matter, kinematical) transformation, one is led to a gener-
alization of conventional dynamics in which such a transformation is
axiomatized as an automorphism of the algebra of observables. This is a
proper generalization, in the sense that it is not always possible to re-
present an automorphism of an abstract C*-algebra by a similarity
transformation by a unitary operator in a given concrete representation
space; but what is more relevant to field theory is that even when each of
a set of automorphisms can be so represented, there will generally be no
one representation in which all of the automorphisms are so representable.
This difficulty does not arise to any significant extent in the quantum
mechanics of a finite number of degrees of freedom, for due to a special
property of finite systems of canonical variables, every automorphism of
the conventionally associated algebra of observables can, in any concrete
representation, be induced by a unitary operator. But in the case of a
quantum field, there are simple apparent dynamical transformations that
can be shown to be not implementable by any unitary transformation in
the case of the Fock-Cook field. Now there is no physical reason why every
self-adjoint operator on the field vector state space should even in
principle be measurable, but it has not been clear how to distinguish, in
effective theoretical terms, those which were. To arrive at such a dis-
tinction, we consider that the canonical variables themselves should be
measurable, and also, in accordance with conventional usage in the case of
a finite number of degrees of freedom, any bounded 'function' of any finite
set of canonical variables. However, since only finitely many particles are
involved in real observations, other self-adjoint operators are only doubt-
fully measurable, except that uniform limits of such bounded functions
must also be measurable, since their expectation value in any state is
simply the limit of the expectation values of the approximating bounded
functions. That is to say, uniform approximation is operationally meaning-
ful, since operators are close in this sense if the maximum spectral value
of their difference is small. The point is now that the simple apparent
dynamical transformations that could not be represented by unitary
transformations in the field state space can however be represented by
automorphisms of the algebra of observables just arrived at (e.g. division
of the canonical p's by λ > 1 and multiplication of the canonical q's by λ
can be represented by such an automorphism, although not by a unitary
transformation in the Fock-Cook field).
More generally, the algebra of measurable field operators defined above
is the same for all concrete quantum fields as defined earlier. That is,
for any two quantum fields (p(.), q(.)) and (p'(.), q'(.)), relative to the
same single-particle structure, there exists an isomorphism between the
corresponding algebras that takes any (say, bounded Baire) function
of p(x) into the same function of p'(x) for all x, and similarly for the q's.
This isomorphism is in fact unique, from which it can be deduced that any
continuous linear single-particle transformation leaving invariant the
fundamental skew form gives rise to a corresponding automorphism of the
algebra. This resolves the problem of defining the field kinematics when
the single-particle kinematics is given.
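As a finite-dimensional illustration of the last two paragraphs: a single-particle transformation that acts on H by an invertible matrix M and on H' by the contragredient matrix (the inverse transpose of M) leaves the pairing x.y' unchanged, and with it the phase e^{i x.y'} in the Weyl relations and any skew form built from the pairing. The scaling example given earlier (the p's divided by λ, the q's multiplied by λ) is the case M = (1/λ) I. This is only a sketch under stated assumptions: H and H' are both modeled as R^n with the ordinary dot product as the pairing, and the function names are hypothetical. It says nothing about the infinite-dimensional question of whether such a transformation is implemented by a unitary operator.

```python
import numpy as np


def pairing(x, y_dual):
    """Bilinear pairing x.y' between H and H', both modeled here as R^n."""
    return float(np.dot(x, y_dual))


def contragredient(m):
    """Inverse transpose of an invertible matrix m."""
    return np.linalg.inv(m).T


def transform(m, x, y_dual):
    """Act by m on H and by its contragredient on H'."""
    return m @ x, contragredient(m) @ y_dual


def scaling(lam, n):
    """Matrix dividing by lam on H; its contragredient multiplies by lam on H'."""
    return np.eye(n) / lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y_dual = rng.normal(size=3), rng.normal(size=3)
    shifted = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)  # illustrative invertible matrix
    for m in (scaling(2.0, 3), shifted):
        tx, ty = transform(m, x, y_dual)
        # The pairing, hence the Weyl phase, should be unchanged.
        print(np.isclose(pairing(tx, ty), pairing(x, y_dual)))
```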
For an operational field dynamics we have to deal mainly (if not, indeed,
exclusively) with the particular transformation that connects the so-called
incoming and outgoing free fields, which may be defined as the scattering
automorphism. In view of the uniqueness of the algebra of field observables,
it does not matter in which representation this automorphism is given.
Tied up with these notions are those of the physical vacuum state,
physical particle canonical variables and occupation numbers, and the
scattering operator. Since what is more-or-less directly observed for
quantum field phenomena is interpretable as the scattering of an incoming
field of particles, it is appropriate to attempt to formulate these various
notions in terms of a given scattering automorphism s. The physical vacuum
state must certainly satisfy the condition of invariance under s. This will
in general not give a unique state, but it is fairly reasonable to assume that
in a realistic theory, the additional requirement of invariance under the
kinematical action of a maximal abelian subgroup of the fundamental
symmetry group may well give uniqueness. The axiom of covariance
asserts that s commutes with the kinematical action of the entire symmetry
group on the field observables, and from this and a well-known fixed-point
theorem the existence of a physical vacuum as so defined follows.
Given a state of an abstract C*-algebra that is invariant under an
abelian group of automorphisms, there corresponds in a well-known
mathematical manner, a concrete representation of the algebra on a
complex Hilbert space K, and a unitary representation of the abelian
group on the space, which give similarity transformations effecting the
automorphisms. In this way there is determined the unitary scattering
operator S, which in this particular representation implements the
automorphism s, and a unitary representation of the maximal abelian
subgroup of the covariance group. The vacuum state is represented by a
vector of K, left invariant by S and this unitary representation. (In the
application to standard relativistic theory, the abelian subgroup would
consist of translations in space-time, which in conventional theory leaves
only the physical vacuum fixed, among all physical states.) The incoming
field is defined as that given by the representation, and the outgoing field
as its transform under S, both having the vector state space K; to avoid
subtle and technical mathematical questions in this connection the
physically plausible assumption of continuity of the physical vacuum
expectation values of the A e^{ip(x)} B and A e^{iq(y')} B, at least when x and y'
range over finite-dimensional subspaces of the single-particle space, is
made, where A and B are fixed but arbitrary field observables. The
p(x) and q(y') that generate the homomorphic images of the one-para-
meter groups [e^{itp(x)}: -∞ < t < ∞] and [e^{itq(y')}: -∞ < t < ∞] are
defined as the canonical variables of the free incoming physical field, and
their transforms under S, those of the outfield. In defining single-particle
state occupation numbers, it is convenient to assume present a distin-
guished complex Hilbert space structure in the direct sum H + H'. For
any single-particle state vector x in H + H', there is then a unique
continuous one-parameter unitary group [U(t): -∞ < t < ∞] taking x
into e^{it}x and leaving fixed the orthogonal complement of x. The corre-
sponding automorphisms of the algebra of field observables likewise form
a one-parameter group. In general they will not leave invariant the phy-
sical vacuum state, but again making physically plausible continuity and
boundedness assumptions, there will be obtained finally a corresponding
one-parameter group of linear transformations in K, which will have a 'di-
agonalizable' generator, i.e. one similar (in general, via a non-unitary
transformation) to a self-adjoint operator. Although these occupation
numbers are not self-adjoint, they have the crucial properties of having
integral proper values; of being such that the total in-field energy,
momentum, etc. is the sum of the products of all single-particle energies,
momenta, etc. with the occupation numbers of the corresponding states in
a formal, but partially rigorizable, manner; and of annihilating the
physical vacuum state vector.
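The 'well-known mathematical manner' invoked at the start of this construction is, in present-day terminology, the Gelfand-Naimark-Segal construction. The following sketch is added for orientation and uses notation that is not in the original: ω for the invariant state, α_g for the automorphisms of the abelian group, and [A] for the class of an element A. It assumes that the algebra contains an identity element.

```latex
% Null ideal and inner product determined by the state \omega:
\[
  N = \{A : \omega(A^{*}A) = 0\}, \qquad
  [A] = A + N, \qquad
  \langle [A], [B] \rangle = \omega(A^{*}B).
\]
% K is the completion of the quotient; representation and cyclic vector:
\[
  \pi(A)[B] = [AB], \qquad \Omega = [\mathbf{1}], \qquad
  \omega(A) = \langle \Omega,\, \pi(A)\,\Omega \rangle .
\]
% If \omega \circ \alpha_g = \omega for every g in the group, then
\[
  U(g)[A] = [\alpha_g(A)]
\]
% is well defined and unitary, and
\[
  U(g)\,\pi(A)\,U(g)^{-1} = \pi(\alpha_g(A)), \qquad U(g)\,\Omega = \Omega .
\]
```

Taking the group to include the powers of the scattering automorphism s gives the operator S as the corresponding U, with the vacuum vector Ω left fixed.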
The fundamental problem of quantum field dynamics from an overall
point of view is and always has been that of the so-called divergences. In
present terms, this is the problem of establishing the existence of the
scattering automorphism s, which must satisfy certain conditions, which
however can not be stated with mathematical precision, this lack of
precision being an inherent difficulty of the problem. That the present
approach may well be relevant to this problem may be seen in the follow-
ing way. The scattering automorphism may be given as an infinite product
integral, and the crucial difficulty has always been that of establishing the
existence of the integrand. This is given formally by a complex exponential
of the integral at a particular time of the 'interaction Hamiltonian', whose
character is relevant here only to the extent that in a variety of interesting
and typical cases, it is a linear expression in the canonical p's and q's,
whose coefficients are relatively untroublesome operators. E.g. in certain
current theories of meson-nucleon interaction, they are simply finite-
dimensional matrices; for fully quantized electrodynamics 'in a box' they
are mutually commutative self-adjoint operators in a Hilbert space. Now
there is no doubt that these formal operators are divergent, in the sense
that they do not represent bona fide self-adjoint operators in the Fock-
Cook representation, — in fact their domains in general appear to consist
only of {0}. But in dealing with these formal operators, we are at liberty
to change the representation employed for the canonical p's and q's
according to the foregoing development. Now it can be shown that there
always exists a representation for which Σ_k q_k × T_k represents in an
obvious manner a bona fide Hermitian operator, provided that each
T_k is a bounded operator. One can deal similarly with Σ_k (q_k × T_k + p_k × V_k)
when the T_k and V_k are mutually commutative self-adjoint
operators. In either case the complex exponential will be a well-defined
unitary operator. Thus although the final physical results are independent
of the representations employed in setting up the theory, the divergence
or convergence, as operators in Hilbert space, of expressions involved in
the analysis, may depend strongly on the representation.¹
¹ For a more detailed account of certain physical points, as well as references to
proofs of relevant mathematical results, see Segal [2] and [3]. For another approach
to the axiomatics of quantum field theory from a partially heuristic point of view,
but with points of contact with the present approach, see R. Haag [1].
Bibliography
[1] HAAG, R., On quantum field theories. Matematisk-Fysiske Meddelelser udgivet
af det Kgl. Danske Videnskabernes Selskab, vol. 29 (1955).
[2] SEGAL, I.E., The mathematical formulation of the measurable symbols of quantum
field theory and its implications for the structure of free elementary particles. To
appear in the report of the International Conference on the Mathematical
Problems of Quantum Field Theory (Lille, 1957).
[3] SEGAL, I. E., Foundations of the theory of dynamical systems of infinitely many degrees
of freedom. I. Matematisk-Fysiske Meddelelser udgivet af det Kgl. Danske
Videnskabernes Selskab, vol. 31 (1959).
QUANTUM THEORY FROM NON-QUANTAL POSTULATES
ALFRED LANDÉ
Ohio State University, Columbus, Ohio, U.S.A.
1. Physical and Ideological Background. Theoretical physics aims at
deducing formal relations between observed data by the combination of
simple and general empirical propositions which, if true, will 'explain' the
variety of phenomena. In the process of constructing a physical theory on
a postulational basis one may distinguish between three steps. First, by
critical evaluation of experience one arrives at ideological pictures for the
connection of individual data (e.g. for the 'path' of a firefly, Margenau)
and at general notions expressed in everyday language which takes much
for granted and may involve circularity in the definition of terms. Second,
the resulting picture is formalized and condensed into general laws.
Third, the formal laws are now put in correspondence with a physical
'model' which gives an operational definition of each symbol, resulting in
a self-consistent physical theory. In spite of its vagueness, step 1 is of
importance to the physicist since it furnishes a legitimate basis for his
selection of one formalism among many possible ones as the formal sub-
structure of his laws.
The quantum theory in its historical development has followed this
procedure; its laws are based today on a few universal, though rather
baffling, principles, the most prominent among them being those of
wave-particle duality, qp-uncertainty, and complementarity. I submit,
however, that the process of reduction has not gone far enough, and
that the quantum principles just mentioned can be reduced further to
simple empirical propositions of a non-quantal character, the combination
of which yields the quantum principles as consequences. The latter can
thus be 'explained' on an elementary and more or less familiar back-
ground "so that our curiosity will rest" (Percy Bridgman). Conforming
with step 1 above, I begin with considerations of a somewhat vague
character in order to lay the ideological groundwork for the formal
substructure of quantum mechanics. Two objects, A and B, or two
'states' A and B of the same 'kind' of object, may be said to be different,
written A ≠ B, when A and B are discernible, i.e. separable by means of
some device, shortly denoted as a 'filter', responding to B with 'no' when
B ≠ A, and with 'yes' when B = A, as depicted by Figs. 1a and 1b, where
Ā is written for different from A, or non-A. The terms 'state', 'filter', 'kind'
of system (atom) are introduced without operational definition; they
happen to correspond to actual situations in microphysical experiments,
however.
[Fig. 1a, Fig. 1b, Fig. 1c]
As an illustration, A may signify a state of vertical orientation of the
molecular axis of a certain kind of particle, and the A-filter may be a
screen with a vertical slit. State Ā may be a state of horizontal orientation
of the same particle, so that the A-filter blocks Ā-state particles.
Imagine now that, starting from a state B ≠ A (Fig. 1b), one gradually
'changes' state B so that it becomes 'more similar' to A (again no oper-
ational definition of the terms in quotation marks is given). One may
expect a priori that an abrupt change from Fig. 1b to 1a will take place
only in the last moment when B becomes exactly equal to A. The postu-
late of continuity of cause and effect requires, however, that a gradual
change from B ≠ A to B = A as cause will lead to a gradual change of
effect, from all B's blocked to all B's passed by the A-filter. More precisely,
the continuity postulate requires that there be intermediate states B
between B ≠ A and B = A, with results intermediate between Fig. 1b
and 1a, that is, with some B's passing and some rejected, as pictured in
Fig. 1c; such cases then signify a 'fractional equality' between B and A,
written B ~ A. The ratio between passed and repelled B-state particles
can only be a statistical ratio, i.e. a probability ratio for an individual B-
state particle. Individual indeterminacy controlled by statistical ratios is a
consequence of the continuity postulate for cause and effect. The passing
fraction, written P(B, A), of B-state particles through the A-passing filter
may be taken as an operational definition of the fractional equality degree
between the states A and B, of value between 0 and 1. And since equality
degrees ought to be mutual, one will introduce the symmetry postulate,
P(A, B) = P(B, A); the latter is physically justified as the statistical
counterpart of the reversibility of deterministic processes. It stipulates
that the statistical fraction of B-state particles passed by an A-passing
filter equals the statistical fraction of A-state particles passing a B-filter.
Similar considerations apply to any game of chance with the alternative
'yes' or 'no', passed or blocked, right or left, etc. For example,
when balls are dropped from a chute upon a knife edge, they will drop to
the right or to the left, depending on the aim of the chute.
According to the continuity postulate, however, there ought to be a
continuity of cases between all balls to the right and all to the left,
occurring within a small range of physical aim, with statistically ruled
ratios of r- and l-balls, gradually changing from 100 : 0 to 0 : 100 when the
physically regulated aim of the chute is changed from one end of the small
angular range to the other. Hypothetical reservations about concealed
causes for individual r- and l-events would never explain the miracle of
'statistical cooperation' of individual events yielding fixed statistical
ratios [1], [2].
Next we introduce the empirical postulate of reproducibility of a test
result, which stipulates that a B-state particle in Fig. 1c which has once
passed the A-filter will pass an A-filter again with certainty. This
harmless-looking postulate implies that the incident B-state particle, in the
first act of passing the A-filter, must have changed its state from B to A.
Indeed, only thus will it pass another A-filter again with certainty.
Similarly, an incident B-state particle once repelled by the A-filter must
have jumped, by virtue of its first repulsion, from B to the new state $\bar{A}$,
so that it will be repelled again if tested once more by the A-filter.
Discontinuous changes of state (transitions, jumps) in reaction to a testing
instrument can thus be seen as consequences of the postulates of
reproducibility of a test result and continuity of cause and effect. To these
postulates we have added that of symmetry, P(A, B) = P(B, A), in
which P now assumes the meaning of a transition probability from state B
to A in an A-filter test, and from A to B in a B-filter test.
2. The Probability Schema. After these ideological preparations we
come to the mathematical schema of the probabilities of transition.
Consider a class of entities S (= 'states' of a given atom) which are in a
mutual relation of 'fractional equality' $S_m \sim S_n$, quantitatively
described by positive fractional numbers $P(S_m, S_n)$, denoted as 'equality
fractions'. Special cases are P = 0 (separability, total inequality of $S_m$
and $S_n$) and P = 1 (identity, inseparability). The P-relations permit a
division of the elements of class S into subclasses: the subclass A with
members $A_1, A_2, \ldots$ which satisfy the orthogonality relation
(1)  $P(A_m, A_{m'}) = \delta_{mm'}$,
the subclass B, and C, and so forth. (The selection of complete orthogonal
subclasses out of the entirety of entities S is not unique, a fact known to
the quantum theorist as 'degeneracy'.)
P-values connecting the elements of two subclasses such as A and B
may be arranged in a matrix:
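(2)  $(P_{AB}) = \begin{pmatrix} P(A_1, B_1) & P(A_1, B_2) & \cdots \\ P(A_2, B_1) & P(A_2, B_2) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$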
The physical interpretation of the P's as probabilities of transition in
tests justifies the postulate that the sum of the transition probabilities
from any one state $A_m$ to the various states $B_1, B_2, \ldots$ be unity, i.e.
that each row of the matrix (2) sums up to unity. Furthermore, according to
the symmetry postulate
(3)  $P(A_m, B_n) = P(B_n, A_m)$,
the columns of the matrix $(P_{AB})$ are the rows of the matrix $(P_{BA})$ so that
the columns of $(P_{AB})$ also have sum unity:
(3')  $\sum_n P(A_m, B_n) = 1$ and $\sum_m P(A_m, B_n) = 1$.
Suppose now that the matrix (2) has M rows and N columns. The sum of
all its elements would then be M when summing the rows, and N when
summing the columns. Thus M = N, that is, the matrices $(P_{AB})$ and
$(P_{AC})$ etc. must be square, and the subclasses A, B, C, ... must all have
the same multiplicity, M. The multiplicity M of the orthogonal sets of
states may be finite or infinite depending on the 'kind' of particle. The
P-matrices are unit magic squares.
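The counting argument of this section can be sketched in a few lines of Python. The snippet below is only an illustration: the function name `is_unit_magic_square` and the sample table `P_AB` are assumptions made for the example and do not come from the text. It checks that every row and every column sums to one, and that the grand total of such a table equals both its number of rows and its number of columns.

```python
from fractions import Fraction as F

def is_unit_magic_square(P):
    """Return True if every row and every column of P sums to exactly 1."""
    rows_ok = all(sum(row) == 1 for row in P)
    cols_ok = all(sum(col) == 1 for col in zip(*P))
    return rows_ok and cols_ok

# Hypothetical 3x3 table of transition probabilities P(A_m, B_n).
P_AB = [
    [F(1, 2), F(1, 3), F(1, 6)],
    [F(1, 3), F(1, 6), F(1, 2)],
    [F(1, 6), F(1, 2), F(1, 3)],
]

# Symmetry postulate (3): the table P_BA is the transpose of P_AB.
P_BA = [list(col) for col in zip(*P_AB)]

# Grand total of all entries: M when summed by rows, N when summed by columns.
total = sum(sum(row) for row in P_AB)
M, N = len(P_AB), len(P_AB[0])
```

Exact fractions are used so that the sums can be compared with 1 without rounding tolerances. A table with one row and two columns, for instance, can have row sum one but then cannot have unit column sums.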
3. The Probability Metric. We now introduce the further postulate that
the various P-matrices are interdependent by virtue of a general law
according to which one matrix (P) in a group is determined by the other
matrices (P) of the same group. Only the following simple interdependence
laws between two-index quantities are feasible:
(4)  the addition law $U_{AC} = U_{AB} + U_{BC}$,
     made self-consistent by $U_{AB} = -U_{BA}$,
and corresponding laws for distorted quantities $W = f(U)$, e.g. for $W = e^U$:
(5)  the multiplication law $W_{AC} = W_{AB} \cdot W_{BC}$,
     made self-consistent by $W_{AB} \cdot W_{BA} = 1$.
There is no other conceivable way of making $U_{AC}$ or $W_{AC}$ independent
of the choice of the intermediate entity B than the addition theorem (4)
and its generalization by distortion.
A model of (4) is furnished by the geometry of lengths $L_{AB}$, $L_{AC}$, etc.
in frameworks connecting points A, B, C, .... Although (4) cannot be
applied to the lengths L themselves, it may be applied to a substructure of
quantities $\varphi$ satisfying the triangular relation
$\varphi_{AC} = \varphi_{AB} + \varphi_{BC}$ with $\varphi_{AB} = -\varphi_{BA}$, known as vectors.
The latter determine the lengths $L = |\varphi|$.
Of particular interest is plane geometry, where vectors $\varphi$ can be written as
complex symbols, $\varphi = |\varphi| e^{i\alpha}$. Also in a plane, 5 points are
connected by 10 lengths; when 9 of them are given they uniquely determine the
tenth L.
In order to construct a law of interdependence between unit magic
squares one may start from (5). Although (5) cannot be applied to the
matrices (P) themselves, it may be applied to a substructure of quantities
$\psi$ which are to satisfy the matrix multiplication formula
(6)  $(\psi_{AC}) = (\psi_{AB}) \cdot (\psi_{BC})$, with $(\psi_{AA}) = (\psi_{AB}) \cdot (\psi_{BA}) = (\mathbf{1})$.
When now decreeing (the asterisk standing for the complex conjugate):
(7)  $\psi(A_k, B_n) = \psi^*(B_n, A_k)$ and $P = |\psi|^2$,
the P-matrices become unit magic squares, as required. (6) is known as
the law of unitary transformation, connecting 'orthogonal axes systems'
A and B etc. by 'complex directional cosines' $\psi$. A tensor f in general
obeys the transformation formula
(8)  $(f_{AD}) = (\psi_{AB}) \cdot (f_{BC}) \cdot (\psi_{CD})$.
To the physicist, the quantities $\psi$ are the 'probability amplitudes' which
satisfy the law of interference (6), and the tensors f are 'observables'. When
f has its eigenvalues in the states $F_1, F_2, \ldots$, that is, when
(9)  $f(F_n, F_{n'}) = f(F_n)\, \delta_{nn'}$,
then, as a special case of (8), one has
(9')  $f(A_k, A_j) = \sum_n \psi(A_k, F_n)\, f(F_n)\, \psi(F_n, A_j)$.
The $\psi$-interference law and the corresponding transformation law for
observables were first found inductively and were considered a most
surprising empirical law of nature. The interference law turns out to be the
only conceivable solution of the mathematical problem of finding a general
self-consistent law connecting unit magic squares, viz. the law of unitary
transformation.
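A small numerical sketch may make the role of the amplitudes concrete. It assumes a particular two-parameter family of 2x2 unitary matrices (the helper `unitary` and the angles used are illustrative choices, not taken from the text), multiplies two amplitude matrices according to (6), forms $P = |\psi|^2$ as in (7), and lets one check that the resulting P-tables are unit magic squares while the P-tables themselves do not multiply.

```python
import cmath
import math

def unitary(theta, alpha):
    """A 2x2 unitary matrix of amplitudes (hypothetical parametrization)."""
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * alpha)
    return [[c, phase * s], [-phase.conjugate() * s, c]]

def matmul(X, Y):
    """Ordinary matrix product, as used in law (6)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def dagger(X):
    """Conjugate transpose: psi(B_n, A_k) = psi*(A_k, B_n), as in (7)."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def probabilities(psi):
    """P = |psi|^2, element by element."""
    return [[abs(z) ** 2 for z in row] for row in psi]

def is_unit_magic_square(P, tol=1e-12):
    rows = all(abs(sum(r) - 1) < tol for r in P)
    cols = all(abs(sum(c) - 1) < tol for c in zip(*P))
    return rows and cols

psi_AB = unitary(0.3, 0.7)
psi_BC = unitary(1.1, -0.4)
psi_AC = matmul(psi_AB, psi_BC)   # law (6)
psi_BA = dagger(psi_AB)           # decree (7)

P_AB, P_BC, P_AC = map(probabilities, (psi_AB, psi_BC, psi_AC))
naive_P_AC = matmul(P_AB, P_BC)   # multiplying the P-tables directly
```

Comparing `naive_P_AC` with `P_AC` shows the interference term: the two tables differ unless the cross terms between the amplitudes happen to vanish.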
In opposition to numerous physicists who see in the interference law
for complex probability amplitudes a profound and unfathomable plan of
nature presenting us with an abstract and unpictorial substructure of
reality manifest in a wave-particle duality, it may be noticed that
(a) each complex $\psi$ may be pictured as a vector in a plane giving direction
to the corresponding probability P, so that the P-metric can be visualized
as a structural framework of lines in a plane, and (b) just as in plane
geometry, where 5 points A, B, C, D, E are connected by 10 lengths $L_{AB}$,
$L_{AC}$, etc. and 9 L's uniquely determine the tenth L, there are direct
relations between the 10 unit magic square matrices $(P_{AB})$, $(P_{AC})$, etc.
which connect 5 orthogonal sets of states, so that 9 P-matrices uniquely
determine the tenth. That is, there are direct relations between the real
probabilities P which can be formulated without resorting to complex
quantities $\psi$ with wave-like phase angles.
4. Quantum Periodicity Rules. The quantum theorems of Born and
Schrödinger
(10)  $pq - qp = h/2\pi i$ and $p = (h/2\pi i)\, d/dq$
are equivalent to the rule that the amplitude function $\psi(q, p)$ is a complex
exponential function
(11)  $\psi(q, p) = \exp(iqp/\mathrm{const})$
with const = $h/2\pi$. The quantum rules (10) or (11) are usually introduced
ad hoc as inductive results of quantum experience. I am going to show
that they are consequences of the following postulates added to those
introduced before:
a) Linear coordinates q and linear momenta p are physically defined
up to additive constants so that there are observables whose
values depend on q-differences and on p-differences only.
b) The statistical density of conjugates q and p is constant in qp-space
(as it is in classical statistical mechanics).
The proof of (11) on the grounds of (a) and (b) rests on the fact that the
complex exponential function $f(x) = \exp(ix/\mathrm{const})$ is the only function
f(x) which, together with its complex conjugate $f^*(x)$, satisfies the
condition that the product $f(x_1) \cdot f^*(x_2)$ will depend on the difference
$x_1 - x_2$ only.
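The property invoked here can be checked numerically, as a sketch rather than a proof. In the snippet below the value of `CONST`, the sample points, and the counterexample `quadratic_phase` are all assumptions chosen for illustration. Two pairs $(x_1, x_2)$ with the same difference give the same product $f(x_1) f^*(x_2)$ for the exponential, but not for a phase that is quadratic in x.

```python
import cmath

def product(f, x1, x2):
    """The product f(x1) * f*(x2) discussed in the text."""
    return f(x1) * f(x2).conjugate()

CONST = 1.5  # hypothetical value of 'const'; any nonzero real number would do

def exponential(x):
    return cmath.exp(1j * x / CONST)

def quadratic_phase(x):
    # Counterexample chosen for illustration: the phase is quadratic in x.
    return cmath.exp(1j * x * x)

# Two pairs (x1, x2) with the same difference x1 - x2 = 0.75.
pair_a = (2.0, 1.25)
pair_b = (-3.5, -4.25)

same_for_exponential = abs(product(exponential, *pair_a)
                           - product(exponential, *pair_b)) < 1e-12
same_for_quadratic = abs(product(quadratic_phase, *pair_a)
                         - product(quadratic_phase, *pair_b)) < 1e-12
```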
The detailed proof runs as follows. As a special case of (9') for an
observable f defined as a function of q one has
$f(p_k, p_j) = \sum_n \psi(p_k, q_n)\, f(q_n)\, \psi(q_n, p_j)$.
If q is a linear coordinate running continuously from $-\infty$ to $+\infty$, and
for given p-values has constant $|\psi|^2$ density, the last formula becomes an
integral with constant weight factor in the integrand:
(12)  $f(p_k, p_j) = \int \psi(p_k, q)\, f(q)\, \psi(q, p_j)\, dq$.
Since f(q) may be any observable whatsoever, one may consider the case
that it is a $\delta$-function with maximum at any chosen place $q_i$; the integral
then reduces to
$f(p_k, p_j) = \psi(p_k, q_i)\, \psi^*(p_j, q_i)$.
If the 'transition value' $f(p_k, p_j)$ is to depend on the difference $p_k - p_j$
only, the function $\psi$ on the right must contain p in the form
(13)  $\psi(q, p) = \exp(\ldots ip \ldots)$.
An analogous consideration applied to an observable g(p) which may be
chosen as a $\delta$-function yields the result that the function $\psi$ must contain
q in the form
(13')  $\psi(q, p) = \exp(\ldots iq \ldots)$.
(13) and (13') together leave only the following alternative: Either
$\psi(q, p)$ is of the form
$\psi(q, p) = \exp(i\alpha q + i\beta p)$
with separate real constant factors $\alpha$ and $\beta$, or
(14)  $\psi(q, p) = \exp(i\gamma qp)$
with common real factor $\gamma$. The first alternative would lead, according
to (12), to
$f(p_k, p_j) = \exp[i\beta(p_k - p_j)] \int f(q)\, dq = \exp[i\beta(p_k - p_j)] \cdot \mathrm{const}$,
where the left hand side depends on the choice of the function f, whereas
the right hand side does not. Only the second alternative makes sense.
When writing $2\pi/h$ for $\gamma$, Eq. (14) is identical with (11), q.e.d. Eq. (11) is
the fundamental wave function of quantum dynamics with wave length
$\lambda = h/p$.
For completeness' sake we add the well-known deduction of the symmetry
theorems which are of such decisive importance for the aggregation
of identical particles. Identity of two particles a and b signifies their
indiscernibility and in particular equality of the two transition probabilities
or, omitting reference to S:
$|\psi(a_i, b_j)|^2 = |\psi(b_i, a_j)|^2$.
This equation can be satisfied only when $\psi$ is either symmetric or
antisymmetric with respect to an exchange of the letters a and b, which is
proved as follows. Write
$\psi(a_i, b_j) = \tfrac{1}{2}[\psi(a_i, b_j) + \psi(b_i, a_j)] + \tfrac{1}{2}[\psi(a_i, b_j) - \psi(b_i, a_j)]$
$= \varphi_{sym}(a, b) + \varphi_{ant}(a, b)$.
Similarly
$\psi(b_i, a_j) = \varphi_{sym}(a, b) - \varphi_{ant}(a, b)$.
Taking the absolute squares of the last two equations one arrives at
$P(a_i, b_j) = |\varphi_{sym}|^2 + |\varphi_{ant}|^2 + \mathrm{Re}(2\varphi_{sym}\varphi_{ant}^*)$,
$P(b_i, a_j) = |\varphi_{sym}|^2 + |\varphi_{ant}|^2 - \mathrm{Re}(2\varphi_{sym}\varphi_{ant}^*)$.
The two P's can be equal only when either $\varphi_{sym}$ or $\varphi_{ant}$ vanishes, i.e.
(excluding the trivial case of $\psi = 0$) when either $\psi = \varphi_{ant}$ or
$\psi = \varphi_{sym}$, q.e.d.
For systems of three or more identical particles $\psi(a, b, c, \ldots)$ must
either be symmetric with respect to the exchange of each pair, or
antisymmetric. Indeed, if $\psi$ were symmetric with respect to a and b, but
antisymmetric with respect to a and c, one would arrive at the following
sequence:
$+\psi(a, b, c) = +\psi(b, a, c) = -\psi(b, c, a) = -\psi(a, c, b) =$
$= +\psi(c, a, b) = +\psi(c, b, a) = -\psi(a, b, c)$,
which is self-contradictory. All particles are thus divided in two classes,
those which form symmetric, and those which form antisymmetric
$\psi$-functions.
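The chain of exchanges above can be traced mechanically. The following sketch assumes, as in the text, a function that is symmetric under the exchange of a and b (factor +1) and antisymmetric under the exchange of a and c (factor -1); the helper `exchange` and the bookkeeping variables are illustrative names only. After six exchanges the original arrangement returns with the opposite sign.

```python
def exchange(arrangement, x, y):
    """Swap the letters x and y wherever they stand in the argument list."""
    mapping = {x: y, y: x}
    return tuple(mapping.get(letter, letter) for letter in arrangement)

# Assumed behaviour: symmetric under a<->b (+1), antisymmetric under a<->c (-1).
steps = [("a", "b", +1), ("a", "c", -1)] * 3

arrangement, sign = ("a", "b", "c"), +1
chain = [(sign, arrangement)]
for x, y, factor in steps:
    arrangement = exchange(arrangement, x, y)
    sign *= factor
    chain.append((sign, arrangement))
```

The list `chain` reproduces the seven members of the sequence in order, ending with the pair `(-1, ("a", "b", "c"))`.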
This concludes the deduction of the quantum theorems from basic
postulates of a non-quantal character.
5. Quantum Fact and Fiction. A few remarks may be added concerning
the present quantum philosophy, reputedly the most revolutionary
innovation in the theory of knowledge of the century. Its starting point is
the allegation that quantum theory has invalidated the notion of objective
states possessed by a microphysical system independent of an observer
(according to some authorities) or independent of a measuring instrument
(according to others). And the quantity $\psi$ is said to have a particularly
'subjective' character in so far as it expresses expectations of an observer,
rather than states of an atom. $\psi$ is also reputed to be 'abstract' and
'unanschaulich' (unpictorial) due to its complex-imaginary form.
In the writer's opinion, this quantum philosophy rests on various
misunderstandings and fictions. First, complex quantities stand for
vectors in a plane; hence $\psi$ gives direction to the transition probabilities
so that the latter form a structural framework in a plane. The
$\psi$-multiplication law (6) is quite analogous to the geometrical vector
addition law $\varphi_{AC} = \varphi_{AB} + \varphi_{BC}$. But nobody has yet found plane
geometry abstract and unpictorial because it connects real lengths by vectors
which could be symbolized by complex numbers.
Second, since a test resulting in the state $A_m$ of an atom is reproducible
by means of the same A-meter, one may legitimately denote the state
$A_m$ as being 'objectively possessed' by the atom. It is true that a
subsequent B-test throws the atom into a new (equally reproducible) state
$B_n$. Thus one does not have the right to say, or even to imagine, that the
atom is in the two states $A_m$ and $B_n$ simultaneously; the two states are
'incompatible'. But incompatibility as such is nothing novel and
revolutionary. A state of angular twist value w of a rod of ice, and a viscosity
value v of the same sample in the liquid state are mutually incompatible;
there are no combination wv-states. It is significant of quantum dynamics
that a state q and a state p, though individually reproducible, do not allow
reproducible 'objective' qp-states; and if an objective q-state has been
ascertained one must not even imagine any hidden simultaneous p-value
to prevail. But this is not initiating a new philosophy of knowledge. It
merely tells us to be careful with the application of the term 'objective
state'. Of course, physicists are more impressed by the example of
qp-incompatibility than by the trivial example of wv-incompatibility. Yet
after thirty years of emphasizing differences, one may as well begin
stressing similarities between quantum physics and everyday experience.
Third, in this connection one ought to remember that statistical law,
as opposed to classical determinism, is known from ordinary games of
chance; they, too, confront us with the 'miracle of statistical cooperation'
of individual events irreducible in principle [1], [2] to hidden causes.
There is no structural difference between the ordinary ball-knife game
described above and the quantum game of Fig. 1c.
Fourth, a great issue has been made of $\psi$ being a subjective expectation
function which suddenly collapses or contracts in violation of the 'wave
equation' when a definite observation is made, turning potentiality into
actuality. However, in spite of the subjectively tainted words 'expectation'
and 'probability', the quantum theory, like any other theory in physics,
correlates experimental data rather than mental states; in particular it
correlates statistical experience gained in tests of atoms with macroscopic
instruments. If someone uses these statistical laws (which are of the
same quality as the Gauss law of errors) for placing bets or for enjoying
anticipations of future events, this is his personal affair and has nothing
to do with the quantum theory. (Similarly, nobody has yet found a
subjective element in Gauss' error law, or in Newton's law of attraction,
because astronomers anticipate eclipses with high accuracy.) The fiction
that quantum theory deals with differential equations for expectations
rather than with the correlation of objective data which never collapse
has instilled utter confusion into the 'quantum theory of measurement'.
Here we learn that a $\psi$-function, after first developing according to the
Schrödinger equation as a kind of 'process equation of motion', suddenly
collapses whenever a point event takes place (according to some
authorities) or only when an observer takes notice of the point event
(according to others). But since nobody can seriously believe in such
inconsistencies, one tries at least to talk away the difficulty, as testified by
extended discussions at many symposiums on 'measurement' during the last
thirty years. The chief trouble is the mistaken view that the Schrödinger
equation describes a physical change of state, either individually or
statistically. Actually it connects various mathematical 'representations'
of one and the same fixed state with one another, be it the fixed state A
before the measurement, or B after the measurement [3], [4], [5], [6].
Fifth, confusion prevails also with respect to the famous wave-particle
duality. In fact the latter has become illusory since Max Born thirty
years ago introduced the statistical particle interpretation of the 'wave
function' and thereby restored a unitary particle theory, following a short
period of doubt whether matter really consisted of waves or of particles.
Before Born it was considered philosophical to argue that neither waves
nor particles are 'real'; but the same pseudo-philosophical talk has
survived although physicists in their sober hours consider particles, and
particles alone, as the constituting substance of matter (in the
non-relativistic domain). Still talking of duality, i.e. drawing a parallel
between a thing (particle) and one of its many qualities (its occasional
periodic probability distribution in space and time), is illogical.
The great merit of Schrödinger's original matter wave theory had been
that it gave an explanation of the discreteness of quantum states in terms
of proper vibrations in a medium. But Born's statistical interpretation,
confirmed by the observation of point events, destroyed the
explanatory character of the Schrödinger waves, without substituting a
rational explanation for the wave-like phenomena. The present
investigation is to fill this gap. The wave-like $\psi$-interference becomes a
natural and necessary quality of particles under the postulate that the unit
magic square P-tables are connected by a self-consistent law; the only
conceivable such law is that of unitary transformation, which is identical
with that of $\psi$-interference (6). Furthermore, the wave-like $\psi$-periodicity,
the basis of all 'quantization', becomes a natural and obvious particle
quality under the postulates (a) (b) for conjugate observables q and p.
Postscript: The deduction on p. 360 is inconclusive. Only perturbation theory
leads to the symmetry principles.
Bibliography
[1] LANDÉ, A., The case for indeterminism. In 'Determinism and Freedom', edited by
Sidney Hook, New York University Press (1958), p. 69.
[2] LANDÉ, A., Determinism versus continuity in modern science. Mind, vol. 67 (1958),
pp. 174-181.
[3] LANDÉ, A., Foundations of quantum theory. Yale University Press, 1955.
[4] LANDÉ, A., The logic of quanta. British Journal for the Philosophy of Science,
vol. 6 (1956), pp. 300-320.
[5] LANDÉ, A., Non-quantal foundations of quantum theory. Philosophy of Science,
vol. 24 (1957), pp. 309-320.
[6] LANDÉ, A., Zeitschrift für Physik, vol. 153 (1959), pp. 389-393.
Symposium on the Axiomatic Method
QUANTUM LOGIC AND THE COMMUTATIVE LAW
PASCUAL JORDAN
Universität Hamburg, Hamburg, Germany
As is well known, quantum mechanics works with operators or matrices;
and if we wish to make the basic ideas of quantum mechanics clear to
ourselves, it is advisable to set aside entirely the mathematical problems
connected with the theory of infinite matrices. Mathematically speaking,
we then have to do only with algebra.
We therefore consider a quantum-physical system (examples would be
easy to name) whose measurable properties are to be represented by
the matrices of a finite degree n, the matrix elements being
arbitrary complex numbers. As is well known, the theory teaches:
To every Hermitian matrix A within the algebra of these matrices
there corresponds a measurable quantity (in other words: a possible structure
of a measuring instrument applicable to the system). The eigenvalues
of the matrix A are the possible results that a measurement of A
can yield. Mathematically, the matrix A can be represented in the form
(1)  $A = \sum_k a_k e_k$,
where the $e_k$ are orthogonal (Hermitian) idempotents,
$e_k e_l = \delta_{kl}\, e_k$,
while the $a_k$ denote the eigenvalues of A. The statement that a measurement
of the quantity A has yielded the eigenvalue $a_1$ can thus be
replaced by the statement that a measurement of the quantity $e_1$ has
yielded its eigenvalue 1 (and not its other eigenvalue 0).
We therefore need only speak of the idempotents.
Let the idempotent quantity $e_1$, for which the eigenvalue 1 was found on
measurement, be in particular indecomposable, i.e. not representable as a sum
of two orthogonal idempotents. Then let an arbitrary idempotent $e'$ be
measured subsequently. How large is the probability that we find the value 1
for $e'$? Quantum mechanics (or the 'statistical transformation theory')
answers:
$w = \mathrm{Sp}(e_1 e')$. (2)
Here Sp denotes the trace of the matrix $e_1 e'$.
These formulations sum up the entire fundamental content of
quantum mechanics.
One can, however, give the matter another formulation, which is
mathematically equivalent to the one just explained. We consider a
projective geometry of n - 1 dimensions, or, put differently,
we consider unit vectors in a space of n dimensions.
The components $\xi_k$ of such vectors are to be arbitrary complex numbers.
Every indecomposable idempotent is then representable as a matrix
$e' = (\xi_k^* \xi_l)$ with $\mathrm{Sp}(e') = \sum_k |\xi_k|^2 = 1$. (3)
(By $\xi^*$ we denote the conjugate of $\xi$.) More generally, there is a
one-to-one correspondence between the linear families of the vectors
considered (or the linear subspaces of the projective geometry) and the
Hermitian idempotents of the matrix algebra considered earlier. Instead of
speaking of the idempotents, we can therefore speak of the associated
families of vectors. If both idempotents in (2) are indecomposable, we have
(in immediately intelligible notation)
$w = \mathrm{Sp}(e_1 e') = |\sum_k \xi_k^* \xi'_k|^2$. (4)
This second way of formulating the basic laws of quantum mechanics
leans more closely than the other on Schrödinger's 'wave
mechanics'.
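The equivalence of the trace formula (2) and the overlap formula (4) can be illustrated with a short computation. The sketch below is an assumption-laden example: the two vectors, the dimension n = 3, and all helper names are chosen for illustration. It builds the indecomposable idempotents from unit vectors as in (3) and compares the two expressions for the probability w.

```python
import math

def normalize(v):
    """Scale a vector of complex components to unit length."""
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

def projector(xi):
    """Idempotent built from a unit vector, with entries xi_k* xi_l as in (3)."""
    return [[xk.conjugate() * xl for xl in xi] for xk in xi]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

# Two arbitrary (hypothetical) unit vectors in a space of n = 3 dimensions.
xi = normalize([1 + 1j, 2, -1j])
eta = normalize([0.5, 1j, 1 - 2j])

e1, e_prime = projector(xi), projector(eta)

# Formula (2): probability as the trace of the product of the idempotents.
w_trace = trace(matmul(e1, e_prime)).real
# Formula (4): probability as the squared modulus of the overlap.
w_overlap = abs(sum(a.conjugate() * b for a, b in zip(xi, eta))) ** 2
```

Up to rounding, `w_trace` and `w_overlap` agree, and the matrix `e1` reproduces itself when squared.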
There is, however, a third way of formulating it, again different in
appearance, which was put forward by Birkhoff and von Neumann.
The (n - 1)-dimensional projective geometry can be defined mathematically
as a lattice ('Verband') with certain properties;
and the resulting incorporation of quantum mechanics into the
mathematical theory of lattices gives us a surprising
new insight: we can accordingly regard the transition from classical
mechanics to quantum mechanics, which is usually viewed as a transition from
commutative to noncommutative algebra of the measurable quantities,
also as a transition from distributive lattices to
modular lattices.
In classical mechanics we can express every piece of information or
statement about the state of a system that is based on a measurement result
by saying that the point representing the state of the system in
phase space lies within a certain point set a of the phase space.
The intersection $a \cap b$ of two point sets in phase space
thus corresponds to the combination of the two associated statements
by 'and'; the union $a \cup b$ corresponds to the combination
of the two statements by 'or'. In this way it is immediately evident
that the totality of possible statements about the state of the
classical mechanical system, just like the subsets of a
point set, forms a distributive lattice. Furthermore, the negation of a
statement corresponds to the passage from the point set a to the
complementary point set $\bar{a}$.
As is well known, in mathematics we speak of a lattice when,
for a certain set of elements a, b, ..., operations $\cap$, $\cup$ are defined
which are associative and commutative and moreover satisfy the axiom
$(a \cap b) \cup a = a \cap (b \cup a) = a$, (5)
from which the idempotence of all elements under these two
operations follows:
$a \cap a = a \cup a = a$. (6)
If between two particular elements a, b the relation $a \cap b = a$ holds
(equivalent to $a \cup b = b$), we also write $a \subseteq b$; this relation
of inclusion is reflexive and transitive. From $a \subseteq b$ and $b \subseteq a$
follows a = b.
As is well known, a lattice is called distributive if it satisfies the
additional axiom
$a \cap (b \cup c) = (a \cap b) \cup (a \cap c)$. (7)
Then the dual law, obtained from (7) by interchanging
the signs $\cap$, $\cup$, holds at the same time; a fact that can be
proved, e.g., by showing (7) to be equivalent to the following dually
symmetric relation:
$(a \cap b) \cup (a \cap c) \cup (b \cap c) = (a \cup b) \cap (a \cup c) \cap (b \cup c)$. (8)
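For the lattice of subsets of a point set, the laws (5), (7), its dual, and (8) can be verified by brute force. The sketch below assumes a three-element set as a small stand-in for the phase space; the names `UNIVERSE`, `SUBSETS`, and the predicate functions are illustrative only. Python's `&` and `|` on `frozenset` play the roles of $\cap$ and $\cup$.

```python
from itertools import combinations, product

UNIVERSE = frozenset({0, 1, 2})  # a small stand-in for the phase space

def all_subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = all_subsets(UNIVERSE)

def absorption(a, b):            # axiom (5)
    return (a & b) | a == a == a & (b | a)

def distributive(a, b, c):       # axiom (7)
    return a & (b | c) == (a & b) | (a & c)

def dual_distributive(a, b, c):  # the dual of (7)
    return a | (b & c) == (a | b) & (a | c)

def median(a, b, c):             # the dually symmetric relation (8)
    return (a & b) | (a & c) | (b & c) == (a | b) & (a | c) & (b | c)

absorption_holds = all(absorption(a, b)
                       for a, b in product(SUBSETS, repeat=2))
laws_hold = all(distributive(a, b, c)
                and dual_distributive(a, b, c)
                and median(a, b, c)
                for a, b, c in product(SUBSETS, repeat=3))
```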
Finally, let it be mentioned that for the passage from a subset a to the
complementary $\bar{a}$ ('negation') the following axioms hold:
$\bar{\bar{a}} = a$;  $\overline{a \cap b} = \bar{b} \cup \bar{a}$;
$a \cap \bar{a} = 0$ = empty set;  $a \cup \bar{a} = 1$ = full set. (9)
If instead of a classical system we now consider a quantum-mechanical
one (again with finite degree n of its matrix algebra), the point sets in
phase space are replaced by the Hermitian idempotents,
or by the linear subspaces of the (n - 1)-dimensional projective
geometry that are in one-to-one correspondence with them. These
likewise admit operations $\cap$, $\cup$, namely in the sense of the
intersection $a \cap b$ of a and b, and of the linear space $a \cup b$
spanned by a and b. But the lattice so defined is no longer
distributive; instead it satisfies only the weaker Dedekind
modular axiom, which (to show at once that it too is
dually symmetric) can be formulated as follows:
$(a \cap b) \cup [c \cap (a \cup b)] = [(a \cap b) \cup c] \cap (a \cup b)$. (10)
Following Birkhoff and von Neumann, one can express this circumstance by
speaking of a quantum logic in contrast to a classical logic.
Of course it is a matter of taste whether one wishes to accept this
designation; it is in any case natural if one
understands by 'logic' the laws of the possible combinations of statements
or items of information about the state of a physical system.
On this view logic too is an empirical
science, because only empirically can it be established which
totality of possible statements belongs to a given physical
system. (Obviously it is in no way contradictory to this
that, on the other hand, one can formulate and carry out all considerations
relating to quantum theory using exclusively the classical, i.e.
distributive, logic.)
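The contrast between the distributive law (7) and the modular law (10) for linear subspaces can be checked on a small finite model. As a loudly stated assumption, the sketch below uses the vector space of dimension 3 over the two-element field GF(2) in place of the complex spaces of the text, because its 16 subspaces can be enumerated exhaustively; modularity of the subspace lattice does not depend on the choice of field. Vectors are encoded as bitmasks, addition is XOR, and all names are illustrative.

```python
from itertools import combinations, product

N = 3  # dimension over GF(2); vectors are the bitmasks 0 .. 2**N - 1

def span(vectors):
    """Smallest subspace containing the given vectors (closure under XOR)."""
    space = {0}
    for v in vectors:
        space |= {u ^ v for u in space}
    return frozenset(space)

def meet(a, b):
    return a & b            # intersection of two subspaces

def join(a, b):
    return span(a | b)      # subspace spanned by a and b

ALL_VECTORS = range(2 ** N)
SUBSPACES = {span(c) for r in range(N + 1)
             for c in combinations(ALL_VECTORS, r)}

def modular(a, b, c):       # identity (10)
    left = join(meet(a, b), meet(c, join(a, b)))
    right = meet(join(meet(a, b), c), join(a, b))
    return left == right

def distributive(a, b, c):  # identity (7)
    return meet(a, join(b, c)) == join(meet(a, b), meet(a, c))

modular_everywhere = all(modular(a, b, c)
                         for a, b, c in product(SUBSPACES, repeat=3))
violations = [(a, b, c) for a, b, c in product(SUBSPACES, repeat=3)
              if not distributive(a, b, c)]
```

Three distinct lines in one plane, for example `span([1])`, `span([2])`, `span([3])`, already give a triple in `violations`: the first line lies in the plane spanned by the other two, yet meets each of them only in the zero vector.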
In quantum logic, too, there is a negation, namely $\bar{e} = 1 - e$,
for which the axioms (9) hold, where 0 and 1 are now to be understood as the
elements of the matrix algebra denoted by these signs.
Decisive for the justification of the Birkhoff-von Neumann point of
view, however, is the following mathematical THEOREM established by
von Neumann: A modular lattice which has some additional
properties (it must be irreducible, admit only finite chains of
inclusion, and be 'complemented') is always a projective
geometry of finite dimension. It is 'complemented' in particular,
in a more special way, if there also exists in it an operation
of negation of the form discussed. In this case the skew field
belonging to the projective geometry has in particular the
property which I have designated by the words 'formally complex'. If
this skew field is to contain the field of real numbers, it must
be either that field itself, or the field of complex numbers, or
the skew field of quaternions.
A charming THEOREM of von Neumann states, incidentally, that the modular
lattices are characterized by the following property: if the distributive
relation (7) holds for three particular elements a, b, c, then it is invariant
under permutations of these three elements. Going further, one can show
(Jordan) that the whole sublattice generated by a, b, c is then distributive;
and this permits the following clarification: two idempotents $e$, $e'$
commute, i.e. $ee' = e'e$, if and only if a distributive relation holds
between $e$, $\bar{e}$, $e'$.
After these preparations I come to the discussion of an idea
that has occupied me for a long time. For the further
development of quantum theory it could become necessary to extend or
generalize the fundamental formalism of quantum mechanics as discussed
above. Are there mathematical possibilities
for this?
This question was first investigated by studying
generalizations of the associative matrix algebras
[1, 2, 4]. These investigations have given rise to a whole series
of further mathematical investigations [5]. However, this
side of the development will not be discussed in more detail here, since,
despite many attractive mathematical results, it has so far yielded
nothing fruitful for physics.
It is natural, however, to study another possibility of generalization,
which consists in attempting once more, within quantum logic,
the transition from the commutative to the noncommutative.
It has indeed turned out that the theory of lattices can be generalized,
by dropping the axiom of commutativity, to a theory of
skew lattices ('Schrägverbände'), which admittedly raises a
wealth of new and in part quite difficult questions, but has already made
possible many beautiful results, of which only a brief indication can be
given in what follows. This noncommutative
generalization of lattice theory was first envisaged by Klein-Barmen,
later taken up by the author, and, independently
of this, also by Matsushita (compare [3]). The author's considerations
have been decisively advanced by the collaboration of E. Witt and
W. Böge.
We consider a set of elements with two associative operations $\vee$,
$\wedge$. We further require as basic axiom
$$(a \wedge b) \vee a = a \wedge (b \vee a) = a, \tag{11}$$
from which idempotence
$$a \wedge a = a \vee a = a \tag{12}$$
again follows. But whereas in (7) the commutative law permits manifold
rearrangements of the letters, the formulas obtained in that way are by no
means to be carried over to skew lattices as well. For example, the
additional axiom
$$(b \wedge a) \vee a = a \wedge (a \vee b) = a, \tag{13}$$
which is independent of (11), is satisfied only by a very special, rather
trivial class of skew lattices.
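The step from the basic axiom (11) to idempotence (12) is not spelled out in the text. One possible derivation, offered here as a sketch, substitutes one half of (11) into the other:

```latex
\begin{align*}
a \wedge a &= a \wedge \bigl[(a \wedge b) \vee a\bigr] = a, \\
a \vee a   &= \bigl[a \wedge (b \vee a)\bigr] \vee a = a.
\end{align*}
```

In the first line the second factor $a$ is rewritten as $(a \wedge b) \vee a$, and the result is the second half of (11) with $a \wedge b$ in place of $b$. The second line is the mirror image: the first term $a$ is rewritten as $a \wedge (b \vee a)$, and the first half of (11) is applied with $b \vee a$ in place of $b$.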
In every skew lattice there are four forms of a reflexive and transitive
containment, which in general have different meanings; if in a particular
skew lattice all four are equivalent, then that skew lattice is
commutative, that is, a lattice. In the general case there also occur
corresponding equivalence classes of more than one element. The four forms
of containment of $a$ in $b$ are defined by the relations:

            left                  right
  strong    $b \wedge a = a$      $b \vee a = b$
  weak      $a \vee b = b$        $a \wedge b = a$          (14)

Each form of strong containment has as a consequence the corresponding
form of weak containment.
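That each strong form of containment entails the corresponding weak form can be checked directly from the basic axiom (11). The short derivation below is a sketch that assumes the reading of table (14) in which strong containment means $b \wedge a = a$ (left) or $b \vee a = b$ (right), and weak containment means $a \vee b = b$ (left) or $a \wedge b = a$ (right):

```latex
\begin{align*}
b \wedge a = a &\;\Longrightarrow\; a \vee b = (b \wedge a) \vee b = b, \\
b \vee a = b   &\;\Longrightarrow\; a \wedge b = a \wedge (b \vee a) = a.
\end{align*}
```

The first line uses $(b \wedge a) \vee b = b$, which is the first half of (11) with $a$ and $b$ interchanged; the second line uses the second half of (11) as it stands.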
The additional axiom (13) is then equivalent to the statement that the two
forms of weak containment always hold together:
$$a \vee b = b \iff a \wedge b = a. \tag{13'}$$
Weaker than (13) is the axiom
$$a \wedge b \wedge a = a \wedge b, \qquad a \vee b \vee a = b \vee a, \tag{15}$$
which is equivalent to the following relations concerning containment:
(15')
Evidently we obtain (15') as a consequence of (11) and (13').
Since it is not arbitrary lattices but only modular lattices that are of
significance for quantum theory, the following fact seems encouraging, and
it is also attractive from a purely mathematical point of view,
independently of physical speculations: the notion "modular" can be carried
over to skew lattices in a simple and beautiful way, namely in the form of
the following axiom, which is modelled exactly on formula (10). Now,
however, the order of the symbols, which in (10) was largely arbitrary
owing to commutativity, matters decisively:
$$(a \wedge b) \vee [c \wedge (a \vee b)] = [(a \wedge b) \vee c] \wedge (a \vee b). \tag{16}$$
The analogy with the commutative case also holds good in the following
sense. Formula (10) can be replaced by the equivalent axiom that
$x \subseteq y$ shall always have the consequence
$$x \cup (c \cap y) = (x \cup c) \cap y. \tag{17}$$
Quite correspondingly, (16) is equivalent to the following statement: if
$x$ is doubly weakly contained in $y$, then
$$x \vee (c \wedge y) = (x \vee c) \wedge y. \tag{18}$$
The notion of "modular skew lattices" defined by (16) also proves
meaningful and appropriate in that there is in fact a great abundance of
examples of it.
The following procedure is suitable for constructing wide classes of
examples of skew lattices. Suppose that a certain skew lattice
$\mathfrak{W}$ is already given; in particular it may be a commutative one,
that is, a lattice. We denote the operations within $\mathfrak{W}$ by the
symbols $\cap$, $\cup$; we then define new operations $\wedge$, $\vee$ in
$\mathfrak{W}$ by
$$a \vee b = fa \cup b, \qquad a \wedge b = a \cap Fb. \tag{19}$$
Here the elements $fx$ and $Fx$ of $\mathfrak{W}$ denote certain functions
of the element $x \in \mathfrak{W}$, and these functions are to have the
following properties:
$$\begin{aligned}
f(fa \cup b) &= fa \cup fb, & F(a \cap Fb) &= Fa \cap Fb, \\
fa \cup a &= a, & a \cap Fa &= a.
\end{aligned} \tag{20}$$
Then the set of elements $\mathfrak{W}$ is a skew lattice with respect to
the operations $\wedge$, $\vee$ as well.
Functions with the properties (20) can be set up in many ways by taking
special structures as a basis. In particular, if suitable lattices are
used, one obtains examples of skew lattices on the basis of a knowledge of
lattices.
A more special class of functions $f$, $F$ satisfies the top line of (20)
in the form
$$f(a \cup b) = fa \cup fb, \tag{21}$$
and the corresponding statement holds for $Fx$. If $\mathfrak{W}$ is a
lattice, then with this more special form (21) of the functions $f$, $F$
the axiom (13) is, incidentally, satisfied exactly when
$$Ffa \supseteq a, \qquad fFa \subseteq a. \tag{22}$$
Now consider an arbitrary lattice $\mathfrak{V}$ with elements
$a, b, \ldots$, and form the direct product of $\mathfrak{V}$ with itself,
that is, a lattice whose elements are pairs $(a_1, a_2)$ of elements of
$\mathfrak{V}$. From this we take the sublattice of those elements for
which $a_1 \subseteq a_2$. In the lattice $\mathfrak{W}$ so described we
define, in order to satisfy (21) and the corresponding relations for $Fx$:
This very special example of a class of skew lattices incidentally also
satisfies the remarkable additional axiom
$$\begin{aligned}
(a \wedge b) \vee (b \wedge a) &= (b \wedge a) \vee (a \wedge b), \\
(a \vee b) \wedge (b \vee a) &= (b \vee a) \wedge (a \vee b),
\end{aligned} \tag{24}$$
which may fittingly be called the axiom of semi-commutative skew lattices,
since in particular it is always satisfied when at least one of the two
operations $\wedge$, $\vee$ is commutative.
The skew lattices that can be derived from lattices by the $f$, $F$
construction have, admittedly, in spite of their great variety, one very
special property in common: they satisfy the additional axiom (a
sharpening of (15))
$$a \vee b \vee c = b \vee a \vee c, \qquad \ldots \tag{25}$$
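The $f$, $F$ construction can be tried out by brute force on a small case. The sketch below is only an illustration under stated assumptions: the base lattice is the chain 0 < 1 < 2, the carrier consists of the pairs $(a_1, a_2)$ with $a_1 \le a_2$, the new operations are read as $a \vee b = fa \cup b$ and $a \wedge b = a \cap Fb$, and the two projections `f` and `F` are hypothetical choices made for this example, since the text's own definition is not legible here.

```python
from itertools import product

# Base lattice V: the chain 0 < 1 < 2 (an assumption chosen for brevity).
V = range(3)

# Carrier W: pairs (a1, a2) from V with a1 <= a2, a sublattice of V x V.
W = [(a1, a2) for a1, a2 in product(V, V) if a1 <= a2]

def cup(a, b):
    """Lattice join in W: componentwise maximum."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def cap(a, b):
    """Lattice meet in W: componentwise minimum."""
    return (min(a[0], b[0]), min(a[1], b[1]))

def f(a):
    """Hypothetical choice: collapse the pair onto its lower component."""
    return (a[0], a[0])

def F(a):
    """Hypothetical choice: collapse the pair onto its upper component."""
    return (a[1], a[1])

def vee(a, b):
    """New (noncommutative) join, read as a v b = fa cup b."""
    return cup(f(a), b)

def wedge(a, b):
    """New (noncommutative) meet, read as a ^ b = a cap Fb."""
    return cap(a, F(b))

def holds(law, arity):
    """True if the law is satisfied by every tuple of elements of W."""
    return all(law(*xs) for xs in product(W, repeat=arity))

closed = holds(lambda a, b: vee(a, b) in W and wedge(a, b) in W, 2)
associative = holds(
    lambda a, b, c: vee(vee(a, b), c) == vee(a, vee(b, c))
    and wedge(wedge(a, b), c) == wedge(a, wedge(b, c)), 3)
absorption_11 = holds(
    lambda a, b: vee(wedge(a, b), a) == a == wedge(a, vee(b, a)), 2)
commutative = holds(
    lambda a, b: vee(a, b) == vee(b, a) and wedge(a, b) == wedge(b, a), 2)
semi_commutative_24 = holds(
    lambda a, b: vee(wedge(a, b), wedge(b, a)) == vee(wedge(b, a), wedge(a, b))
    and wedge(vee(a, b), vee(b, a)) == wedge(vee(b, a), vee(a, b)), 2)
first_relation_25 = holds(
    lambda a, b, c: vee(vee(a, b), c) == vee(vee(b, a), c), 3)

print(closed, associative, absorption_11, commutative,
      semi_commutative_24, first_relation_25)
```

With these choices the six-element structure is closed, associative and satisfies the absorption axiom (11) without being commutative, and it also passes the semi-commutativity test (24) and the first relation of (25). Other choices of `f` and `F` would need to be checked in the same way.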
An example of a modular skew lattice which does not satisfy this
additional axiom (25) (though it does satisfy (15)) is given by the
following operation table for the four elements $0, u, v, 1$ of the skew
lattice $\mathfrak{W}_4$, in which $x$ denotes an arbitrary element of
$\mathfrak{W}_4$:

  $0 \wedge x = 0$        $x \vee 1 = 1$
  $u \wedge x = u$        $x \vee u = u$
  $v \wedge x = v$        $x \vee v = v$
  $1 \wedge x = x$        $x \vee 0 = x$          (26)

This skew lattice $\mathfrak{W}_4$ is of similarly fundamental importance
for the theory of distributive skew lattices as the lattice consisting of
only the two elements 0, 1 is for the theory of distributive lattices. The
notion of distributive skew lattices can, however, be defined in many
different ways, so that the definition is made sharper or, on the
contrary, more tolerant. One example of a distributive law for skew
lattices is the following:
$$\begin{aligned}
c \wedge (b \vee a) &= c \wedge [b \vee (c \wedge a)], \\
[(a \vee c) \wedge b] \vee c &= (a \wedge b) \vee c.
\end{aligned} \tag{27}$$
This very tolerant distributive law, which in the commutative case becomes
equivalent to (7), is satisfied by extensive classes of skew lattices, in
particular also by $\mathfrak{W}_4$. The examples constructed above by the
$f$, $F$ construction with (21) likewise satisfy (27) if a distributive
lattice is taken for $\mathfrak{V}$.
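Because the four-element skew lattice of table (26) is finite, the claims made about it can be checked exhaustively. The following sketch encodes the table as two small functions (the string labels and function names are illustrative choices, not notation from the text) and tests associativity together with the axioms (11), (13), (15), (16), (27) and the one legible relation of (25).

```python
from itertools import product

W4 = ["0", "u", "v", "1"]

def wedge(a, b):
    """Table (26): 0^x = 0, u^x = u, v^x = v, 1^x = x."""
    return b if a == "1" else a

def vee(a, b):
    """Table (26): x v 1 = 1, x v u = u, x v v = v, x v 0 = x."""
    return a if b == "0" else b

def holds(law, arity):
    """True if the law is satisfied by every tuple of elements of W4."""
    return all(law(*xs) for xs in product(W4, repeat=arity))

checks = {
    "associative": holds(
        lambda a, b, c: wedge(wedge(a, b), c) == wedge(a, wedge(b, c))
        and vee(vee(a, b), c) == vee(a, vee(b, c)), 3),
    "(11)": holds(
        lambda a, b: vee(wedge(a, b), a) == a == wedge(a, vee(b, a)), 2),
    "(13)": holds(
        lambda a, b: vee(wedge(b, a), a) == a == wedge(a, vee(a, b)), 2),
    "(15)": holds(
        lambda a, b: wedge(wedge(a, b), a) == wedge(a, b)
        and vee(vee(a, b), a) == vee(b, a), 2),
    "(16)": holds(
        lambda a, b, c: vee(wedge(a, b), wedge(c, vee(a, b)))
        == wedge(vee(wedge(a, b), c), vee(a, b)), 3),
    "(25), first relation": holds(
        lambda a, b, c: vee(vee(a, b), c) == vee(vee(b, a), c), 3),
    "(27)": holds(
        lambda a, b, c: wedge(c, vee(b, a)) == wedge(c, vee(b, wedge(c, a)))
        and vee(wedge(vee(a, c), b), c) == vee(wedge(a, b), c), 3),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The check agrees with the statements in the text: the structure is an associative skew lattice that is modular in the sense of (16) and satisfies (15) and (27), while (13) and the first relation of (25) fail.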
Much sharper distributive laws for skew lattices are obtained, however,
from (8). Because of the commutative law, (8) can evidently be written in
384 different forms, and perhaps all these 384 different ways of writing
(8) have different meanings when they are written with the symbols
$\wedge$, $\vee$ instead of $\cap$, $\cup$.
For reasons whose explanation would take up rather too much space here,
only 6 of these 384 forms can, however, be regarded as presumably
significant. These 6 distributive laws are not all equivalent; whether some
of them may be equivalent is still undecided. The skew lattice
$\mathfrak{W}_4$ defined by (26) satisfies all 6 relations, and 14 further
ones besides, because in $\mathfrak{W}_4$ there are some coincidences which
are trivial in the commutative case but by no means so in the
noncommutative case. These relations will be summarized below.
First, however, to explain the special significance of $\mathfrak{W}_4$,
let the following be mentioned. As is well known, every distributive
lattice can be obtained as a sublattice of a direct product all of whose
factors correspond to the lattice consisting of the two elements 0, 1.
Analogously, a wide class of skew lattices, defined by a certain method of
construction, can be obtained by singling out subdomains from those skew
lattices which arise as direct products of direct factors
$\mathfrak{W}_4$. These skew lattices, which therefore also possess all the
properties of $\mathfrak{W}_4$ listed below, may thus well be called the
skew lattices that are "distributive" in the sharpest sense.
All the statements mentioned, which form only a small selection from more
extensive results, admittedly still leave us far from the goal I have in
mind, which might be indicated as the construction of generalized
projective geometries whose elements form no longer modular lattices but
modular skew lattices. Only after that will it be possible to judge
whether the theory of skew lattices, apart from the fact that it seems to
offer an attractive field of mathematical investigation, could also be
helpful for physics.
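The following additional axioms I, II are satisfied by $\mathfrak{W}_4$:
I) The following eight polynomials coincide:
$$\begin{aligned}
& (b \wedge c) \vee (a \wedge b) \vee (a \wedge c) = (b \vee a) \wedge (c \vee a) \wedge (b \vee c) \\
={}& (a \wedge b) \vee (b \wedge c) \vee (a \wedge c) = (b \vee a) \wedge (b \vee c) \wedge (c \vee a) \\
={}& (b \wedge a) \vee (b \wedge c) \vee (a \wedge c) = (b \vee a) \wedge (b \vee c) \wedge (a \vee c) \\
={}& (b \wedge c) \vee (b \wedge a) \vee (a \wedge c) = (b \vee a) \wedge (a \vee c) \wedge (b \vee c).
\end{aligned} \tag{28}$$
II) The following four polynomials coincide:
$$\begin{aligned}
& (a \wedge b) \vee (c \wedge b) \vee (a \wedge c) = (c \vee a) \wedge (b \vee c) \wedge (b \vee a) \\
={}& (c \wedge b) \vee (a \wedge b) \vee (a \wedge c) = (c \vee a) \wedge (b \vee a) \wedge (b \vee c).
\end{aligned} \tag{29}$$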
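The additional axioms I and II lend themselves to the same kind of exhaustive check on the four-element skew lattice of table (26). In the sketch below the eight polynomials of (28) and the four of (29) are written out term by term; the helper names `join`, `meet`, `polys_28` and `polys_29` are illustrative and not taken from the text.

```python
from functools import reduce
from itertools import product

W4 = ["0", "u", "v", "1"]

def wedge(a, b):
    """Table (26): 0^x = 0, u^x = u, v^x = v, 1^x = x."""
    return b if a == "1" else a

def vee(a, b):
    """Table (26): x v 1 = 1, x v u = u, x v v = v, x v 0 = x."""
    return a if b == "0" else b

def join(*xs):
    """x1 v x2 v ... (bracketing is irrelevant, both operations are associative)."""
    return reduce(vee, xs)

def meet(*xs):
    """x1 ^ x2 ^ ..."""
    return reduce(wedge, xs)

def polys_28(a, b, c):
    """The eight polynomials of axiom I, in the order of (28)."""
    return [
        join(wedge(b, c), wedge(a, b), wedge(a, c)),
        meet(vee(b, a), vee(c, a), vee(b, c)),
        join(wedge(a, b), wedge(b, c), wedge(a, c)),
        meet(vee(b, a), vee(b, c), vee(c, a)),
        join(wedge(b, a), wedge(b, c), wedge(a, c)),
        meet(vee(b, a), vee(b, c), vee(a, c)),
        join(wedge(b, c), wedge(b, a), wedge(a, c)),
        meet(vee(b, a), vee(a, c), vee(b, c)),
    ]

def polys_29(a, b, c):
    """The four polynomials of axiom II, in the order of (29)."""
    return [
        join(wedge(a, b), wedge(c, b), wedge(a, c)),
        meet(vee(c, a), vee(b, c), vee(b, a)),
        join(wedge(c, b), wedge(a, b), wedge(a, c)),
        meet(vee(c, a), vee(b, a), vee(b, c)),
    ]

triples = list(product(W4, repeat=3))
axiom_I = all(len(set(polys_28(*t))) == 1 for t in triples)
axiom_II = all(len(set(polys_29(*t))) == 1 for t in triples)
# The common values of I and II need not agree with each other:
differ_somewhere = any(polys_28(*t)[0] != polys_29(*t)[0] for t in triples)

print(axiom_I, axiom_II, differ_somewhere)
```

Both families coincide on all 64 triples, while the common value of (28) differs from that of (29) for some triples (for instance $a = 0$, $b = u$, $c = v$), so the two axioms are not the same statement.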
Bibliography
[1] JORDAN, P., Über eine nicht-desarguessche ebene projektive Geometrie.
Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg,
vol. 16 (1949), pp. 74-76.
[2] JORDAN, P., Zur Theorie der Cayley-Grössen. Akademie der Wissenschaften
und der Literatur. Abhandlungen der Mathematisch-Naturwissenschaftlichen
Klasse. Series 3 (1949), pp.
[3] JORDAN, P., Die Theorie der Schrägverbände. Abhandlungen aus dem
Mathematischen Seminar der Universität Hamburg, vol. 21 (1957),
pp. 127-138.
[4] JORDAN, P., J. v. NEUMANN and E. WIGNER, On an algebraic generalization
of the quantum mechanical formalism. Annals of Mathematics, vol. 35
(1934), pp. 29-64.
[5] KOECHER, M., Analysis in reellen Jordan-Algebren. Nachrichten der
Akademie der Wissenschaften in Göttingen, Series IIa, Nr. 4 (1958),
pp. 67-74.
Symposium on the Axiomatic Method
LOGICAL STRUCTURE OF PHYSICAL THEORIES
PAULETTE FÉVRIER
Henri Poincaré Institute, Paris, France
At the present time, the methodology of Theoretical Physics is not yet
well determined and clear. There are various conceptions of it according
to the different physicists, and, for some of them, the axiomatisation of
the theories is only a part of the development of Theoretical Physics.
Adequacy is the fundamental notion from the theoretical physical
point of view. A theory is adequate in a certain experimental domain if
the predictions provided by this theory on the basis of given experimental
data taken within this domain, agree with experiment. This reference to
experiment introduces notions that we do not meet in mathematical
theories.
Let us consider a part of a physical theory which is axiomatised in a
suitable way. Let us leave aside the physical meaning of the terms used
in this theory. Then we get a certain mathematical theory. This theory
possesses a structure (in Bourbaki's meaning) ; we shall call it the formal
structure of the part of the physical theory under consideration.
If we have been able to axiomatise the whole theory, it receives a
formal structure in the meaning above. But one should not forget that,
in a physical theory, the terms must have a physical meaning, which is
nothing else but an intuitive meaning. This meaning being left apart,
one would have only a formal model left which would lose its interest for
the physicist. That is why the meaning must always be taken into account
together with the structure.
Many authors worked out more or less precise axiomatisations of wave
mechanics or quantum theories, every one of which has its advantages
and drawbacks, but an axiomatisation which would be at the same time
completely satisfactory and adequate seems not yet to have been
proposed. I mean that, independently of the difficulties regarding the
formal expression, which we shall leave aside as if they were resolved,
an axiomatisation must not allow any example of inadequacy, i.e.
a physical case which should be described by the axiomatised theory
and yet escapes this description. One could give some examples of
particular cases of inadequacy that have been mentioned in connection with
certain attempts at axiomatisation of wave mechanics:
a) certain axiomatic systems show a lack of adequacy because there are
potentials V for which the Hamiltonian operator of the Schrödinger equation
no longer possesses the general properties which are required of the
operators associated with physical observables;
b) on the other hand, it may be useful to examine whether some other
axiomatic systems do not have to be modified or improved because of
spectra which have points of accumulation (for instance, see Colmez'
paper [3]).
In spite of these difficulties, it is possible to realise axiomatisations of
some parts of a physical theory, but the main problem from a theoretical
physical point of view is to come to a better theory rather than to a perfect
axiomatisation of a given one. It is a fact that a theory which interests
a physicist is never completely built up and presents some defects,
whereas a well-shaped theory that is in some way completed no longer
attracts him, probably because, when this stage is reached, he is already
running towards some other new, growing theory. The processes of formation
of new theories are what interest the physicist, whereas the logician and
the mathematician care for formal completion.
Every physical theory holds in a limited experimental domain; the
problem which is always before the physicist is to find new conceptions
leading him to a new theory adequate to the experimental data
unaccounted for by the preceding ones.
Hence, if physico-logical studies can be useful for the physicist, they
will be useful provided they are applied to a theory not completely achieved
but still in the course of its development. Once the theory is fully
built up, its inadequacies appear, the boundaries of its experimental
domain are known, and the physicist turns towards the building of a new
and better theory. That is the reason why, from the special standpoint of
the physicist, it may be more useful to elaborate physico-logical
considerations in order to help his attempts at new theories than to try
to provide, for a completed theory, a satisfactory axiomatisation in the
strictest sense of the term.
However, the properly axiomatic enquiries about a given physical
theory are necessary, not only from a formal point of view, but also from
the point of view of the theory of knowledge.
Whatever point of view we adopt, it seems to me that the first difficulty
which arises is to determine exactly what we mean by a physical theory
and by a satisfactory axiomatisation of a physical theory. What does the
physicist intend when he tries to elaborate a physical theory? This question
appears very important because, if we look at the considerations I
mentioned before, we can find that they are not all related to the same
meaning of the idea of a physical theory. It seems to me that such a
question can be answered in three quite different ways:
1) the aim of the physicist when he makes a new theory can be only to
find new results, that is to build up, at any rate and by any means, a
theory which enables him to predict some new experimental datum ;
2) the aim of a physical theory can be to provide what we call an ex-
planation of physical reality, that is a formal construction, adequate to
the experimental data, which connects them in a satisfactory rational
way. Presently, what we should call a "rational way" of building ex-
planations means a deductive way, according to the axiomatic method. This
conception of a physical theory does not exclude the search for adequate
predictions, but puts the formal requirements in first place ;
3) the aim of a physical theory can also be a description of physical
reality in the sense of a connexion between the set of experimental data,
and some principles and notions which are intuitively considered as funda-
mental in something like a "Weltanschauung". These primitive elements
must lead deductively to statements verified by experiment and they aim
also to supply adequate predictions, but they are chosen first with
respect to their fundamental role in the description. They arise from
various previous considerations which form, according to Destouches'
expression, an "inductive synthesis" [5a, p. 86; 5b, vol. I, p. 114].
Several axiomatic systems can be set up with respect to the same
experimental domain; according to the second meaning of a physical
theory, the best theories are the most suitable among these various
axiomatic systems with respect to the requirements of the axiomatic method.
According to the third meaning of a physical theory, the best theory is
not necessarily the best system from an axiomatic point of view, but the
one, among the various systems, which depends on the most fundamental
notions in a heuristic sense.
If we come back to the first of the three preceding meanings, we see
that it has to be considered as a minimum requirement with respect to
the question: what is a physical theory? Indeed, it founds a physical
theory on the single purpose of calculating predictions for future
experimental data, starting from initial experimental data.
Taking this minimum requirement as a primitive assumption, one can
build up the most general physical theory, that is a general theory of
predictions [5b, vol. II, pp. 505-654, vol. III, pp. 705-742; 5c]. A
summary of this theory has been given by Destouches in his lecture. It
aims to be a frame into which any physical theory will fit; I mean it aims
to point out what, in any physical theory, is involved in the particular
initial purpose of calculating predictions.
This general theory of predictions is, by definition, a physical theory
in the first sense of the term, but, according to the second one, it can be
axiomatised. According to the third meaning, the general theory of
predictions points to the purpose of calculating predictions as one of the
most fundamental notions in a physical theory, if we consider a physical
theory as an attempt made by a physicist in order to provide a "Welt-
anschauung" adequate, not only to the experimental data already
known, but also to future experimental data. However, a physicist who
assumes the third meaning of a physical theory requires more than the
single idea of adequate prediction to set up a particular physical theory.
The physico-logical studies do not restrict themselves to one of these
three points of view. That is why they are not formal on the whole, though
many parts of them can be formalised. They do not claim to be more
than an aid, both for the approaches of the physicist and for those of the
metamathematician or of the philosopher.
The way in which physico-logical studies may contribute to the elaboration
of physical theories is the following: in order to satisfy a certain
physical condition by means of a theory that we try to elaborate,
physico-logical considerations can supply the theoretical requirements to
be fulfilled by this theory. When the physical theory must satisfy several
physical conditions, some of them being in contradiction, physico-logical
considerations permit us to reduce the contradictions, and to establish
which elements of the theory remain to be determined in order to complete
it.
I shall try now to give some examples of physico-logical enquiries, in
the sense explained above.
The first task in setting up a physical theory is to elaborate schemes of
the concrete physical operations, such as making a measurement, reading
the result of a measurement, etc. For example, from a schematic point of
view, a measurement which is supposed but not effectively realised can
be assimilated to an effective measurement; we can also assimilate a result
of measurement to a prediction for the very instant when the measurement
is effected.
Further, we have to represent such schemes by suitable mathematical
entities, such as sets, elements, etc. In this connection, let us consider
more thoroughly the initial assumption on which the general theory of
predictions is
based. We have to make precise what we mean by an "initial datum" and
by a "prediction statement", and, more generally, to determine what are
the various kinds of statements which have to be taken into account in a
physical theory.
Measurements are classified into types called observables, and deter-
mined by various experimental processes. One observable is represented
by an element of a set called set of the observables. An experimental datum
is read by the observer on the dial or the scale of a measuring apparatus
(for example, it is the position of the spot in a galvanometer). This
position is not infinitely precise. In a schematic way, we can represent it
by an interval E with rational ends on a straight line or a circle. Its
extent is estimated by the observer on the basis of all that he knows
about the precision of his apparatus. For instance, the precision is ob-
tained by repeating a particular measurement a rather large number of
times (we know that its results might be always the same), and by
calculating the standard deviation of the various numbers obtained in
this way. The weakest assumption we can make about it is to assume that
the result of the measurement is this very interval E and not a special
point, determined but unknown, inside that interval. In the case of
macroscopic theories we can admit that there is in E one point which is the
real result of the measurement, but in microphysics such an assumption
cannot be made.
Fig. 1
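The passage from repeated readings to an interval with rational ends can be illustrated numerically. The sketch below rests on an assumption that is not in the text: the interval is taken as the mean of the readings plus or minus one sample standard deviation, widened outward to multiples of 1/100. The function name `reading_interval` and the sample readings are hypothetical.

```python
import math
from fractions import Fraction
from statistics import mean, stdev

def reading_interval(readings, denominator=100):
    """Interval with rational ends around repeated readings of one dial.

    Assumption (not from the text): the interval is mean +/- one sample
    standard deviation, widened outward to multiples of 1/denominator.
    """
    centre = mean(readings)
    spread = stdev(readings)
    lo = Fraction(math.floor((centre - spread) * denominator), denominator)
    hi = Fraction(math.ceil((centre + spread) * denominator), denominator)
    return lo, hi

# Hypothetical galvanometer readings of the same quantity.
readings = [2.00, 2.10, 1.90, 2.00, 2.05]
lo, hi = reading_interval(readings)
print(f"E = [{lo}, {hi}]")
```

The point of returning `Fraction` objects is that the datum is the interval itself, with exactly representable ends, rather than some unknown real number inside it.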
Instead of an apparatus with only one dial we may have an apparatus
with several dials, which can be linear or circular. For every reading of
a dial we shall have an interval on a straight line or on a circle. In order
to get the complete result of the measurement, which is an $n$-interval
with rational ends, all the dials must be taken into account at the same
time, and then the result is represented in the cartesian product of the
curves on which the results taken from every dial have been represented.
This cartesian product makes up a space called the "observational space of
the observable A", denoted by $(R_A)$, and such a space is associated with
each observable [1] (see figure 1).
A sentence which states an experimental datum is an empirical sentence
such as
$$\text{at } t_0, \quad \mathrm{Re\,Mes}\,A \subset E_A$$
where $t_0$ is an instant of the observer's clock, Re Mes A is a certain
set in $(R_A)$, and $E_A$ a specified set in $(R_A)$.
Though no specific theory is assumed when we begin to set up a
general theory of predictions, we cannot give a physical meaning
to an empirical sentence without admitting that such a meaning is
provided to this empirical sentence by a certain theory, namely the theory
by means of which the experiment has been motivated.
In order to take into account the case of a theory with a quantization,
we have to introduce the set $\mathcal{A}_A$ of the possible values of an
observable A, which is a set in the observational space $(R_A)$. In the
particular case of no quantization, as in macroscopic physics,
$$\mathcal{A}_A = (R_A).$$
Hence, from an empirical sentence, we obtain a so-called experimental
sentence by intersection of $E$ and $\mathcal{A}$, that is
$$\text{at } t_0, \quad \mathrm{Re\,Mes}\,A \subset \mathcal{E} \qquad (\text{where } \mathcal{E} = E \cap \mathcal{A}).$$
In the case of no quantization
$$\mathcal{E} = E.$$
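The passage from an empirical to an experimental sentence is an intersection, which a few lines of code can make concrete. The sketch below is a toy model under stated assumptions: the set of possible values is a finite list of rationals, the interval read on the dial is closed, and the function name `experimental_set` is hypothetical.

```python
from fractions import Fraction

def experimental_set(interval, spectrum=None):
    """Return the set named in an experimental sentence: E intersected with
    the set of possible values.

    interval : (lo, hi), the closed interval E read on the dial.
    spectrum : finite iterable of allowed values (a toy stand-in for the
               set of possible values), or None for "no quantization".
    """
    lo, hi = interval
    if spectrum is None:
        # No quantization: every point of E is a possible value,
        # so the intersection is E itself.
        return interval
    return sorted(v for v in spectrum if lo <= v <= hi)

E = (Fraction(1, 4), Fraction(5, 4))
levels = [Fraction(k, 2) for k in range(5)]   # 0, 1/2, 1, 3/2, 2

quantized = experimental_set(E, levels)   # only the levels lying inside E
classical = experimental_set(E)           # the interval E unchanged
print(quantized, classical)
```

In the quantized case only the two levels 1/2 and 1 survive the intersection; without quantization the experimental sentence names the same set as the empirical one.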
As I have already said, we may in theoretical physics take into
consideration not only statements expressing facts effectively realized,
but also statements concerning supposed facts. From a schematic point
of view we can look at these supposed data in the same way as the real
data.
Then, from an experimental sentence or a pair of experimental sentences
about one observable A, we can derive by logical means new sentences
which will also be experimental sentences. In this way, we obtain
a calculus for the experimental sentences concerning only one observable A.
We can then look at this calculus as a formal system. For example, we
can denote by
  $p_1$ the sentence: at $t_0$, Re Mes A $\subset \mathcal{E}_{p_1}$
  $p_2$ the sentence: at $t_0$, Re Mes A $\subset \mathcal{E}_{p_2}$
and define
  $p_1 \,\&\, p_2 =_d$ Re Mes A $\subset (\mathcal{E}_{p_1} \cap \mathcal{E}_{p_2})$   (logical product)
  $p_1 \,\mathrm{V}\, p_2 =_d$ Re Mes A $\subset (\mathcal{E}_{p_1} \cup \mathcal{E}_{p_2})$   (logical sum or superposition; see figure 2)
  $\neg p =_d$ Re Mes A $\subset (\mathcal{A} - \mathcal{E}_p)$   (negation)
From these definitions arise the rules of the calculus of experimental
sentences for one observable. When it is formalized, this system is a
language $L_{1,A}$ of the experimental sentences. It is obviously a Boolean
algebra [7a].
Fig. 2
Fig. 3
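That the calculus for a single observable is a Boolean algebra can be checked on a toy model in which each experimental sentence is identified with the subset of possible values that it names. The four-point spectrum and the function names below are assumptions made for the illustration.

```python
from itertools import combinations, product

# Toy set of possible values of one observable (an assumption).
SPECTRUM = frozenset(range(4))

def all_sentences():
    """Every subset of SPECTRUM stands for one experimental sentence."""
    items = sorted(SPECTRUM)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def conj(p, q):
    """Logical product: intersection of the two sets."""
    return p & q

def disj(p, q):
    """Logical sum for one observable: union of the two sets."""
    return p | q

def neg(p):
    """Negation: complement within the set of possible values."""
    return SPECTRUM - p

sentences = all_sentences()

distributive = all(
    conj(p, disj(q, r)) == disj(conj(p, q), conj(p, r))
    for p, q, r in product(sentences, repeat=3))
de_morgan = all(
    neg(conj(p, q)) == disj(neg(p), neg(q))
    for p, q in product(sentences, repeat=2))
complemented = all(
    disj(p, neg(p)) == SPECTRUM and conj(p, neg(p)) == frozenset()
    for p in sentences)

print(len(sentences), distributive, de_morgan, complemented)
```

The sixteen sentences satisfy the distributive law, De Morgan's law and the complement laws, which is what one expects of the algebra of subsets of a set.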
Now, another case must be examined. Because of a lack of precision
when we read the result of the measurement, it is possible that we cannot
state whether it is the $n$-interval $E_1$ or another $n$-interval $E_2$
which is the result really indicated by the dial, and we can state only
that the result is: $E_1$ or $E_2$. In that case, the datum is no longer
expressed by an experimental sentence, since the result is no longer one
set, but one set or another set. However, we know something about this
result, which can be expressed by a kind of proposition of a more general
type than the experimental sentence defined above, and which we shall call
an experimental propositional, according to a suggestion of Prof. Beth
(see figure 3).
The propositional expressing that the result of the measurement is
either $E_1$ or $E_2$ will be denoted by $p_1 \,\mathrm{v}\, p_2$, and
every propositional can be obtained from experimental sentences by means of
that operation v, called mixture or strong logical sum.
Experimental sentences may be considered as particular propositionals.
When we formalize the system obtained in that way we have a
language $L_{2,A}$ of the propositionals for the observable A.
Now we can take experimental sentences concerning several observa-
bles A , B, .... If these observables can be observed at the same time, we
form a compound observable that I shall denote A § B, and we can
reduce to the case above of one observable; § is a binary operation which
is applied to certain pairs of the set of the observables. If there is a single
pair of observables which cannot be measured together (as in micro-
physics), we have to bring in the calculation of predictions, in order to
be able to describe that special case.
As Destouches explained in his lecture, the general theory of predictions
enables us to point out a correspondence between every experimental
sentence $p$ and a subspace $\mathcal{M}_p$ passing through the origin of a
vector space. If that space had a finite number of dimensions, then all
observables would have a finite spectrum; hence, adequacy requires that
the space be infinite dimensional. When an operation applied to
experimental sentences yields an experimental sentence $p$, a subspace
$\mathcal{M}_p$ corresponds to this sentence and the operation induces in
the space an operation on the subspaces.
Then the properties of the sentential calculus on the experimental
sentences will be those of the calculus on the associated subspaces. The
study of these properties enables us to point out the characteristics of the
theory elaborated in order to calculate predictions for future measure-
ments. Two cases are to be considered:
1) in the case of one observable, or of observables each pair of which
can be measured at the same time, the set of the associated subspaces is a
Boolean algebra. Hence, in a physical theory where all observables can be
measured at the same time, the experimental sentences follow the rules
of the classical sentential calculus ;
2) in the second case, there is at least one pair of observables which,
by right, cannot be measured at the same time. "By right" means:
according to the theory. In that case, the study of the operations on the
subspaces associated with the experimental sentences points out the
characteristics of the corresponding logical operations, in such a way that
the logic which is then adequate is no longer the classical sentential
calculus, but a special logic LCS with the following rules for the logical
negation, product and sum [1; 7a, pp. 91-216]:
NEGATION: To the negation $\neg p$ of $p$ corresponds the sentence
asserting that the result of the measurement is not in $\mathcal{E}_A$;
hence the associated subspace is the complementary orthogonal subspace.
LOGICAL PRODUCT: Because of the homomorphism between the
sentential calculus and the calculus on the subspaces $\mathcal{M}$, with a
sentence $r$ which is the logical product of $p$ and $q$,
$$p \,\&\, q = r,$$
is associated the intersection of the corresponding subspaces
$$\mathcal{M}_r = \mathcal{M}_p \cap \mathcal{M}_q.$$
But, in the case where $p$ and $q$ relate to observables which, by right,
cannot be measured at the same time, the corresponding subspace
$\mathcal{M}_{p \& q}$ is reduced to the point 0. Hence the sentences of
type $p \,\&\, q$ are excluded, that is, certain pairs of sentences are not
composable.
LOGICAL SUM: The conjunction "or" joining two experimental sentences
can take two different meanings:
Superposition: We can define a weak logical sum $p \,\mathrm{V}\, q$ with
the following meaning: a measurement of A has been effected, or a
measurement of B, or a measurement of the compound observable A & B, with
such an imprecision that we can only assert that the result of the
measurement belongs to
$$(\mathcal{E}_A \times \mathcal{A}_B) \cup (\mathcal{A}_A \times \mathcal{E}_B),$$
$\mathcal{A}_A$ and $\mathcal{A}_B$ being the spectra of A and B. To this
operation corresponds the sum of subspaces
$$\mathcal{M}_{p \mathrm{V} q} = \mathcal{M}_p \oplus \mathcal{M}_q,$$
$\mathcal{M}_{p \mathrm{V} q}$ being the subspace spanned by
$\mathcal{M}_p$ and $\mathcal{M}_q$ in the vector space.
In wave mechanics, this operation describes the notion of superposition.
In this way, the logical operations &, V, with $\neg$, have the same
properties as the operations $\cap$, $\oplus$ and orthocomplementation of
the subspaces passing through 0 of the vector space. Thus the sentential
calculus appears as isomorphic to an algebra of infinite dimensional
ortho-complemented projective geometry. It is an ortho-complemented
lattice, non-modular in the general case.
Mixture: On the other hand, the mixture of experimental sentences
leads us to a calculus of experimental propositionals, in the following way.
With the sentences $p$, $q$, we can associate a sentence
$p \,\mathrm{v}\, q$ called strong logical sum. It means that either the
observable A has been measured, and the result has been found in
$\mathcal{E}_A$, or the observable B has been measured, and the result has
been found in $\mathcal{E}_B$, or both observables have been measured if
they are composable, and the result has been found in
$\mathcal{E}_A \times \mathcal{E}_B$, but we do not know which one of these
three cases has been realized. To this strong logical sum v corresponds the
union of the associated subspaces, which is not a subspace:
$$\mathcal{M}_{p \mathrm{v} q} = \mathcal{M}_p \cup \mathcal{M}_q.$$
Hence we are led to distinguish from the sentential calculus a calculus
of propositionals, a propositional being a strong logical sum of sentences.
This is the language $L_4$ of the experimental propositionals, which is a
distributive lattice in $\cap$, $\cup$.
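The difference between the weak sum (the span of two subspaces) and the strong sum (their set union) shows up already in the smallest possible case. The sketch below uses the plane over the two-element field GF(2), which is only a toy stand-in: the text is concerned with an infinite-dimensional space, and orthocomplementation is not modelled here. All function names are illustrative.

```python
from itertools import product

ZERO = (0, 0)

def add(x, y):
    """Vector addition in the plane over GF(2)."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def span(vectors):
    """Smallest subspace of GF(2)^2 containing the given vectors."""
    space = {ZERO} | set(vectors)
    while True:
        bigger = space | {add(x, y) for x, y in product(space, repeat=2)}
        if bigger == space:
            return frozenset(space)
        space = bigger

def is_subspace(vectors):
    """A set of vectors is a subspace if it equals its own span."""
    return frozenset(vectors) == span(vectors)

def meet(m, n):
    """Logical product: intersection of subspaces."""
    return m & n

def weak_sum(m, n):
    """Superposition: the subspace spanned by both."""
    return span(m | n)

def strong_sum(m, n):
    """Mixture: plain set union, in general not a subspace."""
    return m | n

# The three lines through the origin of GF(2)^2.
Mp, Mq, Mr = span([(1, 0)]), span([(0, 1)]), span([(1, 1)])

lhs = meet(Mp, weak_sum(Mq, Mr))            # Mp meets the whole plane
rhs = weak_sum(meet(Mp, Mq), meet(Mp, Mr))  # span of two zero subspaces

mixture = strong_sum(Mp, Mq)
union_distributes = (
    meet(Mp, strong_sum(Mq, Mr))
    == strong_sum(meet(Mp, Mq), meet(Mp, Mr)))

print(lhs == rhs, is_subspace(mixture), union_distributes)
```

With the weak sum the distributive law fails (the left side is the whole line, the right side only the zero subspace), whereas with the strong sum it holds, at the price that the union of two lines is no longer a subspace. This mirrors the distinction drawn in the text between the calculus of sentences and the distributive calculus of propositionals.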
Now, we may have to express that, for instance, if $p$ is asserted, then
$q$ has also to be asserted. That introduces a relation $\rightarrow$ in a
language $L_5$. To the relation $\rightarrow$ corresponds the relation of
inclusion for the corresponding subspaces. In the same way, we shall have a
language $L_6$ for the propositionals; and in that case, to the symbol
$\rightarrow$ will correspond the inclusion of the sets formed by union of
subspaces.
Finally, we have a language $L_7$ which is the language of the physical
theory under consideration. In order to understand what the language
$L_7$ is, let us take the case of Newtonian mechanics: $L_7$ denotes what
would be the formalization of Newtonian mechanics when the initial
conditions are left free. (Here the experimental sentences state that the
initial conditions belong to a certain set of the phase space.) In any
physical theory, $L_7$ corresponds to the formalization of its deductive
part.
We see that the general theory of predictions enables us to make
apparent the logical structure of physical theories, by means of a
correspondence between certain subspaces and the various sets of sentences
used in these theories. Moreover, the general theory of predictions thus
shows that a distinction must be made between two kinds of physical
theories, as Destouches says in his lecture. The calculus of experimental
sentences of the quantum theories is not a Boolean algebra but an algebra
of projective geometry, and that is, in my opinion, the most important
characteristic of the structure of this kind of theory.
I should now like to give another example of physico-logical
considerations concerning the comparison between these two kinds of theories.
A historical example of this duality of physical theories is given
by the opposition between the so-called classical probabilistic
interpretation of wave mechanics and the causal and deterministic
interpretation proposed, some years ago, by David Bohm [2], Louis de Broglie
[4] and several other physicists.
I think that, from a physico-logical point of view, we do not have to
decide in favour of one or the other kind of theory, because it is
possible, as I shall try to show now, to find means of translating one into the
other and conversely [7b; 7c].
First, one can prove that it is possible to pass from a quantum
phenomenalist theory to a causal theory, provided a modification is made of the
notion of the physical system described.
Consider a source of particles, for example an electron gun. In wave
mechanics in its usual meaning, the system S that the theory plans to
describe is one electron; in a causal theory, the observed system is
determined only if we decide which experimental apparatus we put after
the gun. For instance, we can put a screen with one hole of a given
diameter; this measuring apparatus $a_A$ allows us to know, with a precision
determined by the diameter of the hole, the value of the observable A,
which is the position of the particle; the system under observation is then
$S/a_A$. We might put, instead of a screen with a single hole, a screen with
two holes (Young's holes). Then we should have a quite different system,
$S/a_B$, which cannot be realised at the same time as $S/a_A$. Thus, in a causal
description, we have in these two cases two different physical systems;
indeed, the boundary conditions are different and the quantum potentials
are different. The parameters which can be reached by experiment are not
the same in the two cases. The initial conditions, in both descriptions,
are the same: they are determined by the characteristics of the gun.
In the case of the usual quantum theory, the studied system is, as we
have seen, the particle S; what we know about the gun determines the
initial wave; if we use the compound system-apparatus $S/a_A$, we measure
the observable A on S; if we use the compound system-apparatus $S/a_B$,
we measure the observable B on S. Thus, the observed
system is always the same system S.
The rules of wave mechanics supply predictions by means of
probabilities concerning the results which will be obtained. As we have seen
before, the set of the experimental sentences is a non-distributive, and
generally non-modular, lattice. But it admits Boolean sub-lattices $B_A$ for
every observable A. And such a sub-lattice is identical to the Boolean
sub-lattice of the experimental sentences concerning the system $S/a_A$ in the
causal description. This identity is what makes possible the duality of
description, but the correspondence between the sentences of the two types of
theories cannot be extended further than the case of the experimental sentences
for a complete observable in the quantum description. Indeed, the lattice of
the experimental sentences of the probabilistic description is not
distributive, while the algebra of the experimental sentences in the causal
description is distributive. The correspondence can be extended only by
means of probabilities: to an experimental sentence expressing a maximum
observation on the system S (hence determining a single initial function)
corresponds a law of probability for the observable A, hence a valuation
of the Boolean algebra $B_A$, and a distribution law for the system $S/a_A$.
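The contrast between a non-distributive lattice of subspaces and a Boolean sub-lattice inside it can be made concrete on a very small case. The following is a minimal sketch under an assumption that is not in the text: the subspaces are those of the plane over the two-element field GF(2), with meet taken as intersection and join as linear span. All names are hypothetical.

```python
from itertools import product

def span(generators):
    """Subspace of GF(2)^2 spanned by the given vectors (toy assumption)."""
    vectors = {(0, 0)}
    for coeffs in product((0, 1), repeat=len(generators)):
        x = sum(c * g[0] for c, g in zip(coeffs, generators)) % 2
        y = sum(c * g[1] for c, g in zip(coeffs, generators)) % 2
        vectors.add((x, y))
    return frozenset(vectors)

def join(a, b):
    """Lattice join: the smallest subspace containing both a and b."""
    return span(list(a | b))

def meet(a, b):
    """Lattice meet: the intersection of two subspaces."""
    return a & b

ZERO = span([])
L1, L2, L3 = span([(1, 0)]), span([(0, 1)]), span([(1, 1)])
WHOLE = span([(1, 0), (0, 1)])

def is_distributive(elements):
    """Check a ^ (b v c) == (a ^ b) v (a ^ c) for all triples."""
    return all(
        meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
        for a in elements for b in elements for c in elements
    )

full_lattice = [ZERO, L1, L2, L3, WHOLE]   # all subspaces of the plane
boolean_part = [ZERO, L1, L2, WHOLE]       # sub-lattice picked out by two lines

print(is_distributive(full_lattice), is_distributive(boolean_part))
```

With the three lines present, the meet of `L3` with the join of `L1` and `L2` is `L3`, whereas the join of the two meets is the zero subspace, so distributivity fails; restricted to two lines and their bounds, the lattice is the four-element Boolean algebra.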
We see then that position plays a special role in a causal theory;
indeed, A may be the position; if A is not the position, we shall observe the
system $(S + a_A)/a_B$, where B is an observable reducible to a measurement
of position; thus, by changing the physical system under observation, one can
always reduce to position in a causal theory.
In this way, we see that the two kinds of theories are equivalent with
respect to a certain experimental domain, or set of facts. We can pass from
one to the other provided a modification is made of our conception of the
physical system under consideration.
Conversely, it can be shown that a translation is possible from a causal
theory to a probabilistic one. Let us assume a causal theory which supplies
descriptions for the systems $S/a_A$, $S/a_B$, etc. Is it possible to build up a
probabilistic theory supplying the same predictions as this causal theory
but containing no parameters that we cannot reach by
experiment? In such a theory, we have to take into account only the
sentences corresponding to parameters which can be submitted to
experiment. Hence the construction of the theory must be effected in two steps:
1) To select, among the initial sentences in the causal theory, and for every
system $S/a_A$, $S/a_B$, etc. (S remaining the same), the experimental
sentences and their consequences. That can be done by a logical process
using the modality "experimentable". This process enables us to show that
the experimental sentences of the causal theory form a subset of the
lattice of the experimental sentences of the probabilistic theory.
2) Then we have to join in a single description concerning S the partial
descriptions corresponding to every system $S/a_A$, etc. That can be
done; it is then sufficient to identify the expressions of the
probabilities computed according to the causal theory with those computed
according to the general theory of predictions. Since every theory
supplying predictions can be put in the frame of the general theory of
predictions, such an identification is possible.
Thus, if one has been able to set up an adequate causal theory in
microphysics, one can, by eliminating the elements of the theory which cannot be
experimented upon, build up a probabilistic theory, equivalent to the given
theory in the following sense: it supplies the same predictions about future
measurements. Such a theory has the same structure as the usual quantum
theory, and is essentially indeterministic. It does not contain
non-experimentable observables.
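The idea of eliminating a non-experimentable parameter while keeping the same predictions can be sketched on a toy model. Everything below is a hypothetical illustration and not the construction of [7b]: a hidden parameter `lam` with four equiprobable values fixes every outcome, and the apparatus names `a_A` and `a_B` and the outcome labels are invented for the example.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical causal model: a hidden parameter `lam` fixes every outcome.
HIDDEN_VALUES = [0, 1, 2, 3]              # assumed finite and equiprobable
WEIGHT = Fraction(1, len(HIDDEN_VALUES))

def outcome(apparatus, lam):
    """Deterministic result of a measurement on the system S/apparatus."""
    if apparatus == "a_A":
        return "left" if lam < 2 else "right"
    if apparatus == "a_B":
        return "bright" if lam % 2 == 0 else "dark"
    raise ValueError(apparatus)

def predictions(apparatus):
    """Eliminate the hidden parameter: keep only outcome probabilities."""
    probs = defaultdict(Fraction)
    for lam in HIDDEN_VALUES:
        probs[outcome(apparatus, lam)] += WEIGHT
    return dict(probs)

print(predictions("a_A"))
print(predictions("a_B"))
```

The resulting table of probabilities no longer mentions `lam`; it gives the same predictions about measurements as the deterministic model, but it is a purely probabilistic description.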
From these two processes of translating one kind of theory into the
other, we can see that the two kinds do not differ with respect to experimental
data or adequacy. Their difference, in fact, concerns methodological
assumptions. If one prefers a positivistic approach in elaborating physical
theories, then one cannot admit in a theory physical entities which cannot be
experimented upon, but the price of this is indeterminism. On the other hand,
if one cannot accept indeterminism, one has to assume that certain physical
entities escape experimentation.
Bibliography
[1] BIRKHOFF, G. and VON NEUMANN, J., The logic of quantum mechanics. Annals of
Mathematics, vol. 37 (1936), pp. 823-843.
[2] BOHM, D., Suggested interpretation of the quantum theory in terms of "hidden"
variables. Physical Review, vol. 85 (1952), pp. 166-193; vol. 87 (1952), p. 389;
vol. 89 (1953), p. 458.
[3] COLMEZ, J., Définition de l'opérateur H de Schrödinger pour l'atome d'hydrogène.
Annales scientifiques de l'École Normale Supérieure, 3ème Série, vol. 72 (1955),
pp. 111-149.
[4] DE BROGLIE, Louis, a) Sur la possibilité d'une interprétation causale et objective
de la mécanique ondulatoire. Comptes-rendus Acad. Sciences Paris, vol. 234
(1952), p. 265.
b) La physique quantique restera-t-elle indéterministe ? Paris, 1953, VII + 113 pp.
[5] DESTOUCHES, J. L., a) Essai sur la forme générale des théories physiques. Thèse
principale pour le Doctorat ès Lettres, Paris, 1938; Monographies
mathématiques de l'Université de Cluj (Roumanie), fasc. VII (1938).
b) Principes fondamentaux de Physique théorique. Paris, 1942, IV + 905 pp.
c) Corpuscules et systèmes de corpuscules. Notions fondamentales. Paris, 1941,
342 pp.
[7] FEVRIER, P., a) La structure des théories physiques. Paris, 1951, XII + 424 pp.
b) Sur l'élimination des paramètres cachés dans une théorie physique. Journal
de Physique et le Radium, vol. 14 (1953), p. 640.
c) L'interprétation physique de la mécanique ondulatoire et des théories quantiques.
Paris, 1956, VIII + 216 pp.
Symposium on the Axiomatic Method
PHYSICO-LOGICAL PROBLEMS
J. L. DESTOUCHES
Henri Poincare Institute, Paris, France
1. Introduction. I call physico-logical problems not the purely logical
ones, but those in which both logical conditions and some physical
interpretation arise. There are various questions of this kind about a
physical theory; but these questions have not yet been studied in detail,
and we still have to detect and specify the problems that occur and to build up
suitable methods. I shall try to give a general survey of physico-logical
problems and to summarize the general theory of predictions.
2. Formal considerations. Let us take a physical theory which is
considered as complete by a physicist. We can, as in the case of
Euclidean geometry, axiomatize and formalize it, and make the same formal
enquiries about it as about a mathematical theory. However, when
we consider a modern physical theory, it is in fact very difficult to
elaborate a suitable axiomatic system. Very often an axiomatic system
for a physical theory does not cover all physical cases; some exceptional
case appears which does not fit the axiomatic scheme. Here I shall put
aside this purely formal point of view, and consider only physico-logical
problems.
3. The three parts of a theory. First of all, many people believe that a
physical theory taken as a whole is a deductive theory, that is, a theory
based upon a few primitive terms and postulates and then developed in a
strictly deductive way. But, in fact, things are not so simple: we find a mixture
of physical notions which have to be clarified by degrees; and the physical
theory will keep the imprint of the efforts which led to its formation. I
have called this first stage the inductive synthesis of the theory [4], which
brings us to the axiomatic part of the theory, itself the second stage. Then
comes the deductive stage, the third one. But in fact, the preceding
description is still too simple. The primitive terms and the postulates are not
introduced all together but progressively. The three stages are mixed
up with one another. What is the deductive part of a subtheory is at the
same time a piece of the inductive synthesis of a more fully developed
part of the whole theory.
Formalisation can only be applied to the deductive side of the theory;
in particular, the whole inductive synthesis cannot be formalised, but
only some parts of it.
4. Adequacy. In a physical theory, we cannot lose sight of the physical
meaning of the terms; we shall therefore remain at the level of intuitive
semantics; the requirement of adequacy to experiment dominates any
study about the notion of a physical theory. Adequacy consists in the fact
that the predictions calculated according to the theory considered are
not at variance with experiment. At best a theory is adequate in a
certain field called the adequacy-domain of the theory [10; 4c, pp. 40-69].
5. Search for a new theory. The search for a better theory belongs to the
normal development of theoretical physics. Physico-logical considerations
allow us to find out whether a new theory should replace an older one;
and to shape a theory better than given theories.
Processes of unification of given theories can be pointed out, whether
these theories show mutual contradiction or not [5; 4b, pp. 122-147].
When we elaborate a physical theory, we generally have to take into
account incompatible conditions. Various formal processes can be used to
avoid the contradictions, but the difficulty lies in finding a formal
process appropriate to the physical requirements.
6. Formal structure. To each physical theory (as well as to each part of
a physical theory) corresponds a formal structure [3]: the structure of the
formal mathematical system in which the theory is formulated. I shall
call this formal mathematical system the algorithm of the theory. When
we pass over to a better theory, or to the unification of several theories, a
part of the formal structure of the preceding theory is maintained [1];
it helps us to set up the new theory. For instance, if the law of
connexions between observers remains the same when we pass from a theory
$Th_0$ to a theory $Th_1$, then the geometrical algorithm remains unchanged
in the new theory [1; 7]. For example, in classical mechanics the
geometrical algorithm is the vector calculus in the field of real numbers,
and in wave mechanics we have as the geometrical algorithm a weaker
algorithm. It is necessarily a vector calculus. On the other hand, the
general theory of predictions implies that to each observable corresponds
a linear operator. So this weaker algorithm is a vector calculus on a ring
of operators.
Quite a large part of wave mechanics can be obtained by this process.
7. General theory of predictions. A more concrete level of the study of
physical theories appears when one takes into account the fact that the
aims of a physical theory are, at the minimum, to calculate predictions
about the results of future measurements, starting from the results of
initial measurements. In that way, we are led to a general theory of
predictions which has a great many consequences [6; 11; 10b, pp. 91-318].
If an initial experimental datum obtained by an observer Ob about an
observable A on the physical system S at an instant $t_0$ on his clock is
described by a set $\mathcal{E}_A$ of the observational space $(\mathcal{R}_A)$,
$\mathcal{E}_A \subset (\mathcal{R}_A)$; and if
we are trying to calculate some prediction for the result of a measurement
which will be realised at an instant $t'$ by an observer Ob' (in the future) on
the system S, this prediction will be expressed in terms of a function
$\mathfrak{P}$, the arguments of which are of two kinds: 1°) what we know: A,
$\mathcal{E}_A$, $t_0$, and 2°) what we predict: the result of the measurement
which can be obtained at the instant $t'$ of the clock of Ob' by this observer
Ob' and described by a set $\mathcal{E}_B$ of the observational space
$(\mathcal{R}_B)$ of the observable B, that is
(1) $\mathrm{Prob}\{\mathrm{ReMes}\, B \subset \mathcal{E}_B \text{ at } t' \text{ by } Ob' \,/\, \mathrm{ReMes}\, A \subset \mathcal{E}_A \text{ at } t_0 \text{ by } Ob\} = \mathfrak{P}(A, \mathcal{E}_A, t_0, Ob; B, \mathcal{E}_B, t', Ob'; S)$
The problem of prediction is the problem of the computation of the
$\mathfrak{P}$-functions.
In the simplest case we have only one initial measurement and we
consider only one observer (thus Ob' is the same observer as Ob). Here we
limit ourselves to this case.
8. Axiomatisation of measurement. That is the intuitive formulation of
the problem of prediction. We shall now describe this problem in a more
precise and more formal way. The physical system shall be described by a
constant S and a measurement by the predicate Mes; a measurement on
the system S at time $t_0$ with an apparatus a shall be described by
$\mathrm{Mes}(a, S, t_0)$
this being a primitive term. We admit now:
POSTULATE 1: To each apparatus a corresponds an element A of a set T
called "observable" or "type of measurement".
POSTULATE 2: To each measurement at time $t_0$ of type A corresponds
a set $\mathcal{E}_A$ which is a subset of $\mathcal{A}_{A,t_0}$, called
"spectrum of A at $t_0$", and
$\mathcal{E}_A = \mathcal{E}_1 \times \mathcal{E}_2 \times \mathcal{E}_3 \times \dots \times \mathcal{E}_n$.
The $\mathcal{E}_i$ are rational intervals, finite sets or enumerable sets.
In this case we write
$\mathrm{ReMes}(a_A, S, t_0) \subset \mathcal{E}_A$
and this is called an experimental sentence. So $\mathcal{A}_{A,t_0}$, called
the spectrum of A, is a subset of an n-dimensional space $(\mathcal{R}_A)$
called the observational space of A.
POSTULATE 3: The number n of the sets $\mathcal{E}_i$ depends only on the type
of measurement: $n = \varphi(A)$.
9. Axiomatisation of prediction. For an observable B, we consider the
field E of the probabilisable (or measurable) subsets of the spectrum
$\mathcal{A}_B$ of B. We call a probability for
$\mathrm{ReMes}(a_B, S, t) \subset \mathcal{E}_B$, where $\mathcal{E}_B \in E$, the
value of a function $\mathfrak{P}(\mathcal{E}_B)$ defined on E and such that
1°) $0 \le \mathfrak{P}(\mathcal{E}_B) \le 1$
2°) $\mathfrak{P}(\mathcal{A}_B) = 1$
3°) $\mathfrak{P}$ is completely additive: $\mathfrak{P}(\sum_i \mathcal{E}_i) = \sum_i \mathfrak{P}(\mathcal{E}_i)$ if $\mathcal{E}_i \cap \mathcal{E}_j = \emptyset$ for all $i \ne j$
4°) $\mathfrak{P}$ depends on the measured observable A, the result
$\mathcal{E}_A$ of the measurement, the time $t_0$ of this measurement, the
observable B, the set $\mathcal{E}_B$, the time t when this measurement shall be
made, and the system S.
POSTULATE 4: For a system S there exists at least one function $\mathfrak{P}$
satisfying the conditions 1°-4°.
For each physical theory there is a set of $\mathfrak{P}$-functions which fulfil
the conditions 1°-4° under the conditions fixed by the principles of this
theory. Conversely, any supplementary condition on the $\mathfrak{P}$-functions
defines a class of physical theories. Thus we have a frame in which to discuss
general properties of a physical theory.
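Conditions 1° to 3° can be checked mechanically when the spectrum is finite. The sketch below assumes a hypothetical three-point spectrum with invented weights, takes the field E to be the set of all subsets, and tests only finite additivity; it illustrates the conditions and makes no claim about any particular physical theory.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite spectrum of an observable B, with assumed weights.
SPECTRUM = frozenset({"b1", "b2", "b3"})
WEIGHTS = {"b1": Fraction(1, 2), "b2": Fraction(1, 3), "b3": Fraction(1, 6)}

def prob(subset):
    """Value of the probability function on a subset of the spectrum."""
    return sum((WEIGHTS[b] for b in subset), Fraction(0))

def field(spectrum):
    """All subsets of a finite spectrum (here the field E is the power set)."""
    items = sorted(spectrum)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_conditions(spectrum):
    """Check conditions 1°, 2° and (finite) 3° on every subset."""
    subsets = field(spectrum)
    bounded = all(0 <= prob(e) <= 1 for e in subsets)          # 1°
    normalised = prob(spectrum) == 1                           # 2°
    additive = all(prob(e1 | e2) == prob(e1) + prob(e2)        # 3°
                   for e1 in subsets for e2 in subsets if not (e1 & e2))
    return bounded and normalised and additive

print(satisfies_conditions(SPECTRUM))
```

Exact rational arithmetic is used so that the normalisation and additivity checks are not affected by rounding.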
10. Initial elements and prediction elements. To calculate the suitable
$\mathfrak{P}$-functions, I proved [6; 11; 10b, pp. 91-318] that it is possible
to proceed as follows:
First the initial experimental data are translated into an abstract
language in which a set $\mathcal{X}_{0,A,\mathcal{E}_A,t_0}$ of abstract elements
called "initial elements" corresponds to the datum:
(2) $\mathcal{X}_{0,A,\mathcal{E}_A,t_0} = \Phi(A, \mathcal{E}_A, t_0, S)$ and $\mathcal{X}_{0,A,\mathcal{E}_A,t_0} \subset \mathcal{X}_0$.
(In wave mechanics, the set $\mathcal{X}_{0,A,\mathcal{E}_A,t_0}$ reduces to the set
of the initial wave-functions $\{\psi_0\}$ compatible with the result
$\mathcal{E}_A$, and $\mathcal{X}_0$ is the sphere of unit radius in a Hilbert
space.) Then an initial element $X_0$ belonging to
$\mathcal{X}_{0,A,\mathcal{E}_A,t_0}$ is transformed at the instant t into another
abstract element X(t) called the "prediction element" by a one-one
transformation $\mathfrak{U}(t, t_0)$ such that
(3) $X(t) = \mathfrak{U}(t, t_0) X_0$
(In wave mechanics X(t) reduces to a wave function $\psi(t)$.)
Then the probabilities for the result $\mathcal{E}_B$ of a future measurement can
be calculated by a time-independent function F:
(4) $\mathfrak{P}(A, \mathcal{E}_A, t_0, Ob; B, \mathcal{E}_B, t, Ob; S) = F(B, \mathcal{E}_B, X; S, Ob)$.
Formulas (2) and (3) are the result of the use of the auxiliary-variables
method; (4) is a condition imposed on the evolution operator $\mathfrak{U}$.
The set $\mathcal{X}$ of all X-elements can be considered as a subset of an
abstract vector space $(\mathcal{Y})$.
If that space $(\mathcal{Y})$ had a finite number of dimensions, then all
observables would have a finite spectrum. Hence, adequacy to
experiment requires that the space $(\mathcal{Y})$ be infinite-dimensional.
11. Decomposition of a spectrum. If $\mathfrak{D}$ is a decomposition of the
B-spectrum and $\mathcal{E}_i$ an element of $\mathfrak{D}$,
there exists at least one $X_i$ in $\mathcal{X}$ for which
(5) $F(B, \mathcal{E}_i, X_i; S, Ob) = 1$
that is, an $X_i$ which guarantees that the B-experimental datum shall be
included in $\mathcal{E}_i$. These $X_i$ can be defined as eigenfunctions of a
linear operator in $(\mathcal{Y})$. Therefore an operator is associated with each
observable by a formal process (without any physical hypothesis; the physical
content is introduced by the analytical form of the operator when this
form is given explicitly) [6; 11; 10b, pp. 91-318].
It is possible to define an equivalence modulo B, $\mathfrak{D}$ in which
(6) $X \equiv \sum_i c_i X_i \quad \text{mod } B, \mathfrak{D}$
In many cases, with a convenient definition of an abstract integral [6h;
18; 19], when there is a continuous part in the spectrum for B, the sum
$\sum_i c_i X_i$ has a limit when we consider the set $\Pi\mathfrak{D}$ of every
decomposition of the B-spectrum; in this case we have
(7) $X \equiv \int c(d\mathcal{E})\, X(d\mathcal{E}) \quad \text{mod } B, \Pi\mathfrak{D}$
12. The spectral decomposition theorem. In (6), $c_i$ is a complex number
and there exists at least one function $f_{B,\mathfrak{D}}$ for which
(8) $F(B, \mathcal{E}_i, X; S, Ob) = f_{B,\mathfrak{D}}(c_i)$
It is possible to choose a function $f_B$ independent of the decomposition
$\mathfrak{D}$ of the B-spectrum [6i, pp. 529-538]; in this case the function
$f_B$ must be a solution of Cauchy's equation
and fulfil some accessory conditions such as $f(0) = 0$. If we exclude the
totally discontinuous solutions of Hamel, the only (continuous) solutions
are
$f_B(c_i) = |c_i|^k$ with $k > 0$.
Moreover, there exists one and only one universal function f independent
of B and $\mathfrak{D}$ when there is a pair of observables which are not
simultaneously measurable [6i, pp. 538-540; 10b, pp. 221-233], that is, a
unique value for the constant k which is the same for all observables B.
In the case where all observables are simultaneously measurable (as in
classical physics) the value of k remains undetermined under the
condition k > 0.
The physical consequence of this fact is that in classical physics there
exist no interferences of probabilities; on the contrary, in quantum
physics, where non-simultaneously measurable observables exist, there
are interferences of probabilities. Hence the value of k is an important
property of a physical theory with non-simultaneously measurable
observables.
P. Fevrier has proved [12] that the constant k is equal to 2, so that
the following spectral-decomposition theorem is valid:
THEOREM: In the case where there exists at least one pair of
non-simultaneously measurable observables, the universal function f is
$f(c_i) = |c_i|^2$;
hence k = 2.
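A small numerical check gives some intuition for why the exponent is pinned down once two different decompositions must be handled together. This is only an illustration under assumptions of its own (a real two-dimensional space and orthonormal bases related by a rotation); it is not the proof cited above. For a unit vector, the sum of $|c_i|^k$ over the coefficients is equal to 1 in every rotated basis only when k = 2.

```python
import math

def coefficients(vector, angle):
    """Components of a real 2-vector in an orthonormal basis rotated by `angle`."""
    x, y = vector
    c1 = x * math.cos(angle) + y * math.sin(angle)
    c2 = -x * math.sin(angle) + y * math.cos(angle)
    return c1, c2

def total_weight(vector, angle, k):
    """Sum of |c_i|**k over the two basis directions."""
    return sum(abs(c) ** k for c in coefficients(vector, angle))

X = (1.0, 0.0)                                   # a unit vector (assumed state)
angles = [0.0, math.pi / 6, math.pi / 4, math.pi / 3]

for k in (1, 2, 4):
    print(k, [round(total_weight(X, a, k), 6) for a in angles])
```

With a single basis (angle 0) every k gives total weight 1, which matches the statement that k stays undetermined when all observables are simultaneously measurable.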
In this case, where there exists a pair of non-simultaneously measurable
observables, it can be proved that the general formalism of predictions cannot
be reduced to a simpler one. On the contrary, when all observables are
simultaneously measurable (in this case the value of k remains arbitrary
under the condition k > 0, and in particular we can put k = 2), the
general formalism of prediction calculus is valid, but it can be reduced to
a simpler one, that is, a phase-space scheme.
So there are two types of physical theories, and only two: in the first
type, there is at least one pair of non-simultaneously measurable
observables; in the second type, all observables are simultaneously
measurable. The classical theories are of the second type, and the quantum
ones are of the first type.
13. Miscellaneous notions. 1°) A theory is called objectivistic if it is
possible to eliminate apparatus of measurement from the theoretical
formulation of phenomena. In this case the formalism of prediction
calculus is reducible to a phase-space-scheme, and thus is of the second
type.
On the contrary, a theory is called subjectivistic if an essential role is
played by observers and apparatus of measurement in the theoretical
formulation of the phenomena. This intuitive definition is interpreted
formally as "the general formalism of prediction calculus for this theory is
not reducible". It results from the above that a subjectivistic theory is of
the first type; reciprocally a theory of the first type is subjectivistic.
2°) An observable B derives from an observable A if it is possible to
compute the value of B at $t_0$ when the result of a measurement of A at $t_0$
is known. A theory admits a state-observable if there exists an observable
such that all observables derive from it. Otherwise, a theory is
without state-observables.
It can be proved that if a theory is without state-observables, this theory
has at least one pair of non-simultaneously measurable observables and so
is of the first type. It is obvious that a theory with a state-observable has
all observables simultaneously measurable and is of the second type.
3°) An experimental datum is a result of a measurement ; it depends on
the observed system and on the apparatus of measurement. Then if an
experimental datum is an intrinsic property of the observed system, this
experimental datum is independent of the apparatus of measurement. If
all experimental data are intrinsic properties of the observed physical
system, then the apparatus of measurement does not play an essential
role and the theory is objectivistic.
On the contrary, if the experimental data are not intrinsic properties of
the system, they depend on the apparatus, which then plays an essential role
in the theoretical description, so that the theory is subjectivistic.
4°) An imprecise experimental datum is analysable if it is equivalent
either to consider the result $\mathcal{E}$ of the measurement ($\mathcal{E}$ is
a set, see Postulate 2), or to consider the result $\mathcal{E}_1$ or the result
$\mathcal{E}_2$, when $\mathcal{E}_1 \cup \mathcal{E}_2 = \mathcal{E}$. In
other terms, a result of a measurement is analysable if for every
prediction it is equivalent to consider the experimental sentence p
corresponding to $\mathcal{E}$, or to consider the logical sum $p_1 \vee p_2$
(where $p_1$ corresponds to $\mathcal{E}_1$ and $p_2$ to $\mathcal{E}_2$).
When an imprecise experimental datum is not analysable, it is
impossible to attribute a precise but unknown value to the measured
observable. By means of the connexion between experimental sentences
and closed linear manifolds in the space $(\mathcal{Y})$ it can be proved [10b,
pp. 156-159, 275-280] that, if the imprecise experimental data are all
analysable, then the theory is objectivistic, and if there is some
non-analysable experimental datum, then the theory is subjectivistic.
5°) The term by right means: "with respect to the requirements of the
theory". On the other hand in fact would mean: "with respect to experi-
ment".
It is very difficult to describe formally the notion of complementarity.
In order to be complementary, two observables must be non-simultane-
ously measurable by right. That condition can be taken as a formal des-
cription of complementarity ; hence a theory with complementarity is a
theory including non-simultaneously measurable observables.
6°) A theory is deterministic by right if there exists at least one initial
element X0 such that from this element X0 it is possible to predict with
certainty the value of all observables at any time. A theory is called
essentially indeterministic if it does not contain such an X0. It can be
proved that a subjectivistic theory is essentially indeterministic, and that
an essentially indeterministic theory (i.e. a theory with indeterminism
by right) is a subjectivistic theory [lOb, pp. 241-244, 260-284].
7°) In a subjectivistic theory, it is necessary to use an apparatus in
order to obtain some information on the observed physical system, and
that apparatus cannot be eliminated from the theoretical description.
Conversely, if the use of an apparatus is essential by right (and not only in
fact), the theory is subjectivistic.
The preceding notions can be defined more precisely; to each
physical notion corresponds a definite term in the formal description of
the physical facts, that is in the formalism of the prediction calculus;
such definitions bring in, in a precise way, the properties pointed out
here.
From these definitions it follows that the uniqueness of the f-function
and the form imposed by the spectral decomposition theorem are a
consequence of any single one of the following assumptions, and that any one
of them implies the others:
1) the theory is a subjectivistic one,
2) there is no state-observable,
3) an experimental datum is not an intrinsic property of the observed
physical system,
4) imprecise experimental data cannot be analysed,
5) there are two observables not simultaneously measurable,
6) there is some complementarity,
7) there is essential indeterminism,
8) by right it is necessary to use an apparatus in order to obtain some
information on the observed physical system.
This last condition is the most intuitive for microphysics and it can be
laid down as a postulate in the form of a principle of observability [13; 10b,
pp. 316-318].
On the contrary, if we assume the negation of one of the above
assumptions, this implies the negation of the others and the prediction
scheme reduces to a phase-space scheme. These conditions have as a
consequence that the observable physical systems can be divided into
two classes:
1) systems which are, by right, directly observable by means of the
sense organs of the observers (i.e. systems for which all observables are
simultaneously measurable by right).
2) systems which, by right, can only be observed indirectly by means of
certain systems of the preceding class called "apparatus" (i.e. systems in
which there exists at least one pair of non-simultaneously measurable
observables).
14. The principle of evolution. In the general formalism of our
prediction calculus, the evolution of the observed physical system S is
described only by the evolution operator $\mathfrak{U}$. Any condition
concerning the evolution of S consists in a condition assigned to
$\mathfrak{U}(t, t_0)$.
To determine the evolution of this $\mathfrak{U}$-operator it is natural to
admit the following principle as a fundamental property for predictions [14]:
"If during the time interval $(t_0, t]$ no measurement is realised on the
observed physical system S (an initial measurement being made at $t_0$), then
the prediction for an instant $\tau$ (between $t_0$ and t) has an effect upon
the predictions for the instant t, and this for all $\tau$".
Any prediction for the instant $\tau$ is obtained from a prediction
element $X(\tau)$, and $X(\tau) = \mathfrak{U}(\tau, t_0) X_0$. A prediction for
the instant t is calculated from the predictions for different times between
$t_0$ and t. That is, any prediction for an instant $\tau$ is considered as an
indication for computing a prediction for the instant t. A prediction for the
instant $\tau$ is computed from $X(\tau)$ (by the spectral decomposition
theorem); in other words this indication is described by $X(\tau)$, and the
contribution of $X(\tau)$ in order to calculate X(t) is an element
$Y(t, \tau)$ obtained as a function of $X(\tau)$, that is
$Y(t, \tau) = \mathfrak{F}(t, \tau) X(\tau)$
where $\mathfrak{F}(t, \tau)$ is an operator.
Considering n + 1 instants
$\tau_0 = t_0, \tau_1, \tau_2, \dots, \tau_i, \dots, \tau_{n-1}, \tau_n = t$
and summing the corresponding contributions, the process used to define an
integral gives us
$X(t) = X_0(t) + \int_{t_0}^{t} \mathfrak{F}(t, \tau) X(\tau)\, d\tau$
where
$X_0(t) = \lim_{n \to \infty} \mathfrak{F}(t, t_0) X_0\, \Delta\tau_0$
This is a functional equation for X(t); we have
$X(t) = \mathfrak{U}(t, t_0) X_0$,
hence
$\mathfrak{U}(t, t_0) = \mathfrak{U}_0(t, t_0) + \int_{t_0}^{t} \mathfrak{F}(t, \tau)\, \mathfrak{U}(\tau, t_0)\, d\tau$
with $\mathfrak{U}_0(t, t_0) X_0 = X_0(t)$.
The equation for the operator $\mathfrak{U}(t, t_0)$ has the form of a Volterra
integral equation of a hereditary process, but it is an equation between
operators and not an equation between functions.
If $\mathfrak{U}_0(t, t_0)$ has the properties of an evolution operator, it can
be interpreted as the evolution operator of a fictive system $S_0$ called a
substratum for S. Also S can be interpreted as a perturbed system and $S_0$ as
a non-perturbed system. The equation in $\mathfrak{U}$ can be solved by a
process of successive approximations; the first step gives the usual
perturbation of first order and the higher steps the perturbations of higher
orders [18].
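The method of successive approximations for a Volterra equation can be sketched on a scalar stand-in for the operator equation. The example below is an assumption-laden simplification: the operators are replaced by numbers, the unperturbed term is taken as the constant 1, the kernel is a constant, and the integral is approximated by the trapezoidal rule. The function name `picard_iterates` is hypothetical. In this scalar case the exact solution is an exponential, so the iterates can be compared with it.

```python
import math

def picard_iterates(kernel, t_end, steps, iterations):
    """Successive approximations for u(t) = 1 + integral_0^t kernel*u(tau) dtau.

    Scalar stand-in for the operator equation; the integral is computed
    with the trapezoidal rule on a uniform grid of `steps` intervals.
    """
    h = t_end / steps
    u = [1.0] * (steps + 1)       # zeroth approximation: unperturbed term
    history = [u]
    for _ in range(iterations):
        integral = 0.0
        new_u = [1.0]
        for j in range(1, steps + 1):
            integral += 0.5 * h * kernel * (u[j - 1] + u[j])
            new_u.append(1.0 + integral)
        u = new_u
        history.append(u)
    return history

history = picard_iterates(kernel=1.0, t_end=1.0, steps=1000, iterations=30)
first_order = history[1][-1]      # first step: 1 + kernel * t at t = 1
converged = history[-1][-1]       # many steps: close to exp(kernel * t)
print(first_order, converged, math.exp(1.0))
```

The first iterate reproduces the linear, first-order correction, and each further iterate adds the next order, which mirrors the perturbation reading given in the text.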
In the general case, $\mathfrak{U}$ is not differentiable and there is no
Hamiltonian, and thus no wave equation; but in many particular cases,
$\mathfrak{U}$ has a time derivative and obeys a differential equation:
where $\mathfrak{H}$ is an operator called the Hamiltonian.
We have
if $\mathfrak{U}_0(t, t_0)$ obeys an equation of this form. We have the wave
equation if $\mathfrak{U}_0$ and $\mathfrak{F}(t, \tau)$ have a time derivative
and if $\mathfrak{F}(t, \tau)$ tends to a limit when $\tau$ tends to t. But in
general $\mathfrak{F}(t, \tau)$ does not tend to a limit when $\tau$ tends to t
and there is only the integral operator equation to describe the
evolution of the system.
15. Experimental sentences. The general theory of predictions leads us
to single out sentences of a special type: the experimental sentences, on
which a calculus can be defined. Thus we get an algebra, which plays an
important part in the physical theories under consideration [15; 10b,
pp. 91-215].
16. Search for new theories. Physico-logical studies alone do not allow
us to build a new physical theory [16; 4c, pp. 54-60]. A new theory can
only be obtained by thoroughly deepening the meaning of the purely
physical notions of a theory. But physico-logical studies definitely help
us. For example, in the recent discussions about the quantum theories,
concerning the discrepancy between the statistical interpretation and the
causal one, the physico-logical considerations served to yield precise
answers: if we have an essentially indeterministic theory, it is always
possible to construct a deterministic theory which gives us the same re-
sults (i.e. the same predictions concerning future measurements) under
the following conditions : i) the notion of a physical system is not the same
in both theories, ii) we must add hidden parameters, some belonging to
the physical system (in the sense of the indeterministic theory) and some
to the measuring apparatus; moreover some of these hidden parameters
are not measurable in any way (they are metaphysical parameters)
[8; 12c, pp. 43-100]. Conversely, P. Fevrier has proved [17; 12c, pp.
135-150] that, if we have a deterministic theory with hidden parameters,
by eliminating these parameters and modifying the notion of a physical
system, we obtain an essentially indeterministic theory. Hence the notions
of determinism and indeterminism are not physical notions, properties of
nature, but are relative to the theoretical requirements.
17. Various levels. Whereas, in the study of mathematical theories, it is
enough to distinguish two levels: the theoretical one, and the metatheo-
retical one, or in other words, the language and the metalanguage, in the
study of physical theories, we have to distinguish a greater number of
levels: for instance, the language of the experimental sentences, the
language of predictions, the language of the theory, the metalanguage
[15].
18. Various approaches. Physico-logical studies are still little developed,
and many problems remain to be formulated. To conclude, I shall point out the
main approaches as follows:
a) To study in a strictly logical way a given physical theory only taken
as a deductive theory.
b) To elaborate general physico-logical considerations when a con-
nexion with experiment is introduced by the notion of adequacy.
c) To come to more particular physico-logical considerations when the
formal structure of a theory is taken into account.
d) To draw the consequences of the following notions: measurements,
experimental statements, predictions. That is to say: to work out the
general theory of predictions.
e) To study the calculus of experimental sentences and enter into
epitheoretical considerations about the general theory of predictions.
f) In particular, the physico-logical researches allow us to separate in a
physical theory the intrinsic (or objectivistic) properties of the physical
objects, from those which are intrinsic properties of the compound
object-apparatus, but not of the objects themselves. Criteria for the
intrinsic and extrinsic properties have been mentioned.
These considerations on intrinsic and non-intrinsic properties played
an important part in the recent developments of physical theories,
namely in the elaboration of the functional theory of particles. In this
theory, a particle is no longer described by a point, but by a function u
or a finite set of functions $u_i$. I have not space enough here to give details
about this theory which I have developed in recent papers [9].
19. Conclusion. The modern physical theories involve such various and
mixed levels of thought that, besides purely physical, logical and mathe-
matical considerations, they need intermediate researches in order to
connect together these different kinds of developments.
Physico-logic is such an intermediate field, and that is the reason why
physico-logical methods do not quite fulfill the formal conditions required
either from a physical theory or from a logical one. But one cannot hope
to surmount the present heavy difficulties of theoretical physics only by
means of the formal achievement of reasonings. Adequacy has to be
realised first of all by a physical theory and, for that purpose, physico-
logical studies can be very helpful and set the theoretical developments in
their right connection with experiment. They are presently in their first
stage, like the studies about foundations of mathematics at their be-
ginning; the formal achievement does not appear at the beginning, it
depends on the efficiency of the methods under consideration, and, on the
other hand, their efficiency depends on their formal strictness. Physico-
logical studies must be broadly developed in both directions, and play
an important part in the future.
Bibliography
[1] AESCHLIMANN, F., a) Sur la persistance des structures géométriques dans le
développement des théories physiques. Comptes Rendus des séances de l'Académie
des Sciences de Paris, vol. 232 (1951), pp. 695-597.
b) Recherches sur la notion de système physique. Thèse de Doctorat ès-Sciences,
Paris 1957.
[2] AESCHLIMANN, F. and J. L. DESTOUCHES, L'électromagnétisme non linéaire et
les photons en théorie fonctionnelle des corpuscules. Journal de Physique et le
Radium, t. 18 (1957), p. 632.
[3] CAZIN, M., a) Algorithmes et théories physiques. Comptes Rendus des séances
de l'Académie des Sciences de Paris, t. 224, pp. 541-543.
b) Algorithmes et construction d'une théorie unifiante. Comptes Rendus des
séances de l'Académie des Sciences, t. 224 (1947), pp. 805-807.
c) Persistance des structures formelles dans le développement des théories
physiques. Thèse de Doctorat Univ. Paris, Lettres-Philosophie, Paris 1947.
d) Les structures formelles des mécaniques ondulatoires et leur persistance dans
les nouvelles tentatives théorique. Thèse de Doctorat ès-Sciences, Paris 1949.
[4] DESTOUCHES, J. L., a) Essai sur la forme générale des théories physiques.
Thèse principale pour le Doctorat ès-Lettres, Paris 1938. Monographies
mathématiques de l'Université de Cluj, fasc. VII, Cluj (Roumanie) 1938.
b) Principes fondamentaux de Physique théorique. Vol. 1, Paris 1942, 174 +
IV pp.
c) Traité de physique théorique et de physique mathématique, t. I. Méthodologie,
Notions géométriques, vol. I, Paris 1953, 228 + XIV pp.
[5] DESTOUCHES, J. L., a) Unité de la physique théorique. Comptes Rendus des
séances de l'Académie des sciences de Paris, vol. 205 (1947), pp. 843-845.
b) Essai sur l'Unité de la physique théorique. Thèse complémentaire pour le
Doctorat ès-Lettres, Paris 1938; Bulletin scientifique de l'École
polytechnique de Timisoara, Roumanie 1938.
[6] DESTOUCHES, J. L., a) Les espaces abstraits en Logique et la stabilité des
propositions. Bulletin de l'Académie royale de Belgique (classe des sciences)
5e sér., vol. XXI (1935), pp. 780-86.
b) Le rôle de la notion de stabilité en physique. Bulletin de l'Académie royale
de Belgique (classe des sciences) 5e sér., vol. XXII (1936), pp. 525-532.
c) Conditions minima auxquelles doit satisfaire une théorie physique. Bulletin
de l'Académie royale de Belgique (classe des sciences) 5e sér., vol. XXIII
(1937), pp. 159-165.
d) Loi générale d'évolution d'un système physique. Journal de Physique et le
Radium, sér. 7, vol. 7 (1936), pp. 305-311.
e) La notion de grandeur physique. Journal de Physique et le Radium, sér.
7, vol. 7 (1936), pp. 354-360.
f) Le principe de Relativité et la théorie générale de l'évolution d'un système
physique. Journal de Physique et le Radium, sér. 7, vol. 7 (1936), pp.
427-433.
g) Les prévisions en physique théorique. Communication au Congrès
international de Philosophie des Sciences, Octobre 1949, Actualités
scientifiques et industrielles Hermann, Paris 1949.
h) Corpuscules et Systèmes de Corpuscules, Notions fondamentales. Vol. 1,
Paris 1941, 342 pp.
i) Principes fondamentaux de physique théorique. Vol. II, Paris 1942, 484 +
VI pp.; vol. III, Paris 1942, 248 + IV pp.
j) Über den Aussagenkalkül der Experimentalaussagen. Archiv für
mathematische Logik und Grundlagenforschung, Heft 2/2-4, pp. 424-25.
[7] DESTOUCHES, J. L., Cours miméogr. Faculté des Sciences, Paris 1957.
[8] DESTOUCHES, J. L., a) Sur l'interprétation physique de la Mécanique
ondulatoire et l'hypothèse des paramètres cachés. Journal de Physique et le
Radium, vol. 13 (1952), pp. 354-358.
b) Sur l'interprétation physique des théories quantiques. Journal de Physique
et le Radium, vol. 13 (1952), pp. 385-391.
[9] DESTOUCHES, J. L., a) Funktionnelle Theorie der Elementarteilchen. Vorlesung
Pariser Universitätswoche, München 1955, pp. 176-183.
b) Fonctions indicatrices de spectres. Journal de Physique et le Radium, vol.
17 (1956), p. 475.
c) Quantization in the functional theory of particles. Nuovo Cimento, suppl.
vol. III, sér. X (1956), pp. 433-468.
d) La quantification en théorie fonctionnelle des corpuscules. Vol. 1, Paris 1956,
VI + 144 pp.
e) Le graviton et la gravitation en théorie fonctionnelle des corpuscules. Comptes
Rendus des séances de l'Académie des Sciences de Paris, vol. 245 (1957),
pp. 1518-1520.
f) La gravitation en théorie microphysique non linéaire. Journal de Physique et
le Radium, vol. 18 (1957), p. 642.
g) Le graviton en théorie fonctionnelle des corpuscules. Journal de Physique et
le Radium, vol. 19 (1958), pp. 135-139.
h) Journal de Physique et le Radium, vol. 19 (1958) (in press).
i) Corpuscules et champs en théorie fonctionnelle. Vol. 1, Paris 1958, VIII +
164 pp.
j) Les systèmes de corpuscules en théorie fonctionnelle (A.S.L. Hermann, Paris
1958).
[10] FEVRIER, P., a) Recherches sur la structure des théories physiques. Thèse
Sciences Math. Univers., Paris 1945.
b) La structure des théories physiques. Paris, 1951, XII + 424 pp.
c) Logical Structure of Physical Theories. This volume.
[11] FEVRIER, P., a) Déterminisme et indéterminisme. Vol. 1, Paris 1955, 250 pp.
b) L'interprétation physique de la Mécanique ondulatoire et des théories
quantiques. Vol. 1, Paris 1956, 216 pp.
c) Determinismo e indeterminismo. Vol. 1, Mexico 1957, 270 pp.
[12] FEVRIER, P., a) Signification profonde du principe de décomposition
spectrale. Comptes Rendus des séances de l'Académie des Sciences de Paris,
vol. 222 (1946), pp. 866-868.
b) Sur l'interprétation physique de la Mécanique ondulatoire. Comptes Rendus
des séances de l'Académie des Sciences de Paris, vol. 222 (1946), p. 1087.
c) L'interprétation physique de la Mécanique ondulatoire et des théories
quantiques. Vol. 1, Paris 1956, 216 pp.
[13] FEVRIER, P., Monde sensible et monde atomique. Theoria (Philosophical
Miscellany presented to Alf Nyman), 1949, pp. 79-88.
[14] FEVRIER, P., a) Sur la recherche de l'équation fonctionnelle d'évolution
d'un système en théorie générale des prévisions. Comptes Rendus des séances de
l'Académie des Sciences de Paris, vol. 230 (1950), pp. 1742-1744.
b) Sur la notion de système physique. Comptes Rendus des séances de
l'Académie des Sciences de Paris, vol. 233 (1959), p. 604.
[15] FEVRIER, P., La logique des propositions expérimentales. Actes du 2e
colloque de Logique mathématique de Paris 1952, Paris 1954, pp. 115-118.
[16] FEVRIER, P., a) Sur la notion d'adéquation et le calcul minimal de
Johansson. Comptes Rendus des séances de l'Académie des Sciences de Paris,
vol. 224 (1947), pp. 545-548.
b) Adéquation et développement dialectique des théories physiques. Comptes
Rendus des séances de l'Académie des Sciences de Paris, vol. 224 (1947),
pp. 807-810.
[17] FEVRIER, P., Sur l'élimination des paramètres cachés dans une théorie
physique. Journal de Physique et le Radium, vol. 14 (1953), p. 640.
[18] GUY, R., a) Comptes Rendus des séances de l'Académie des Sciences de Paris,
1950-1953.
b) Thèse de Doctorat ès-Sciences mathématiques, Univ. Paris 1954.
[19] NIKODYM, O. M., Remarques sur les intégrales de M. J. L. Destouches
considérées dans sa théorie des prévisions. Comptes Rendus des séances de
l'Académie des Sciences de Paris, vol. 225 (1947), p. 479.
PART III
GENERAL PROBLEMS AND APPLICATIONS
OF THE AXIOMATIC METHOD
Symposium on the Axiomatic Method
STUDIES IN THE FOUNDATIONS OF GENETICS
J. H. WOODGER
University of London, London, England
In what follows a fragment of an axiom system is offered — a frag-
ment because it is still under construction. One of the ends in view in
constructing this system has been the disclosure, as far as possible, of what
is being taken for granted in current genetical theory, in other words the
discovery of the hidden assumptions of this branch of biology. In the
following pages no attempt will be made to give a comprehensive account
of all the assumptions of this kind which have so far been unearthed;
attention will be chiefly concentrated on one point — the precise formu-
lation of what is commonly called Mendel's First Law, and its formal
derivation from more general doctrines, no step being admitted only
because it is commonly regarded as intuitively obvious. Mendel's First
Law is usually disposed of in a few short sentences in text-books of
genetics, and yet when one attempts to formulate it quite explicitly and
precisely a considerable wealth and complexity of hidden assumptions is
revealed. Another and related topic which can be dealt with by the
axiomatic method is the following. Modern genetics owes its origin to the
genius of Mendel, who first introduced the basic ideas and experimental
procedures which have been so successful. But it is time to inquire how
far the Mendelian hypotheses may now be having an inhibiting effect by
restricting research to those lines which conform to the basic assumptions
of Mendel. It may be profitable to inquire into those assumptions in order
to consider what may happen if we search for regions in which they do
not hold. The view is here taken that the primary aim of natural science
is discovery. Theories are important only in so far as they promote
discovery by suggesting new lines of research, or in so far as they impose
an order upon discoveries already made. But what constitutes a dis-
covery? This is not an easy question to answer. It would be easier if we
could identify observation and discovery. But the history of natural
science shows abundantly that such an identification is impossible.
Christopher Columbus sailed west from Europe and returned with a
report that he had found land. What made this a discovery was the fact
that subsequent travellers after sailing west from Europe also returned
with reports which agreed with that of Columbus. If the entire American
continent had quietly sunk beneath the wave as soon as Columbus's back
was turned we should not now say that he had discovered America, even
although he had observed it. If an astronomer reported observing a new
comet during a certain night, but nobody else did, and neither he nor
anybody else reported it on subsequent nights, we should not say that he
had made a discovery, we should say that he had made a mistake.
Observations have also been recorded which have passed muster for a
time but have finally been rejected, so that these were not discoveries.
Moreover, there have been observations (at least in the biological sciences)
which have been ignored for nearly fifty years before they have been
recognized as discoveries. Theories play an important part in deciding
what is a discovery. Under the influence of the doctrine of preformation,
in the early days of embryology, microscopists actually reported seeing
little men coiled up inside spermatozoa. Under the influence of von Baer's
germ-layer theory the observations of Julia Platt on ecto-mesoderm in
the 1890s were not acknowledged as discoveries until well into the
twentieth century. Such considerations raise the question : is Mendelism
now having a restricting effect on genetical research?
The distinction between records of observations and formulations of
discoveries is particularly sharp in genetics ; as we see when we attempt to
formulate carefully Mendel's observations on the one hand and the dis-
coveries attributed to him on the other. It will perhaps make matters
clearer if we first of all distinguish between accessible and inaccessible
sets. Accessible sets are those whose members can be handled and counted
in the way in which Mendel handled and counted his tall and dwarf garden
peas. Inaccessible sets, on the other hand, are those to which reference
is usually being made when we use the word 'all'. The set of all tall
garden peas is inaccessible because some of its members are in the remote
past, some are in the (to us) inaccessible future, and some are in in-
accessible places. No man can know its cardinal number. But observation
records are statements concerning accessible sets and formulations of
discoveries are statements concerning inaccessible sets. The latter are
therefore hypothetical in a sense and for a reason which does not apply
to the former statements. But there are other kinds of statements about
inaccessible sets in addition to 'all'-statements. In fact, from the point
of view of discoveries, the latter can be regarded as a special case of a more
general kind of statement, namely those statements which give expression
to hypotheses concerning the proportion of the members of one set, say X,
which belong to a second set Y. When that proportion reaches unity we
have the special case where all Xs are Ys. In the system which is given
in the following pages the notation 'pY' is used to denote the set of all
classes X which have a proportion p of their members belonging to Y,
p being a fraction such that 0 ≤ p ≤ 1. This notation can be used in
connexion with both accessible and inaccessible sets. In the latter case
it is being used to formulate statements which cannot, from the nature
of the case, be known to be true. Such a statement may represent a leap
in the dark from an observed proportion in an accessible set, or it may be
reached deductively on theoretical grounds. In either case the continued
use of a particular hypothesis of this kind depends on whether renewed
observations continue to conform to it or not. Statistical theory provides
us with tests of significance which enable us to decide which of two
hypotheses concerning an inaccessible set accords better with a given set
of observations made on accessible sub-sets of the said inaccessible set.
In the present article we are not concerned with the questions of testing
but with those parts of genetical theory which are antecedent to directly
testable statements. At the same time it must be admitted that more is
assumed in the hypothesis than that a certain inaccessible set contains a
proportion of members of another set. As observations take place in
particular places, at particular times, must there not be an implicit
reference to times and places in the hypotheses concerning inaccessible
sets, if such hypotheses are to be amenable to testing against observa-
tions ? Consider, for example, the hypothesis that half the human children
at the time of birth are boys. This would be the case if all children born in
one year were boys and all in the next year were girls, and so on with
alternate years, provided the same number of children were born in each
year. But clearly a more even spread over shorter intervals of time is
intended by the hypothesis. Again, there cannot be an unlimited time
reference, because according to the doctrine of evolution there will have
been a time when no children were born, and if the earth is rendered
uninhabitable by radio-activity a time will come when no more children
are born. Thus a set which has accessible sub-sets during one epoch may
be wholly inaccessible in another.
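The notation 'pY' and the example of births in alternate years can be made
concrete for accessible (finite) sets. The following Python sketch is an
illustration only and forms no part of the system: the names proportion and
in_pY and the toy set of eight births are assumptions introduced here. It
shows a set which belongs to (1/2)Y as a whole although its yearly sub-sets
have the proportions 1 and 0.

```python
from fractions import Fraction


def proportion(X, Y):
    """Fraction of the members of the finite, non-empty set X that belong to Y."""
    X = set(X)
    if not X:
        raise ValueError("proportion is undefined for an empty set")
    return Fraction(len(X & set(Y)), len(X))


def in_pY(X, p, Y):
    """True when X belongs to pY, i.e. exactly a proportion p of X lies in Y."""
    return proportion(X, Y) == Fraction(p)


# Hypothetical accessible set: eight births tagged (year, serial number, sex).
# All births of year 1 are boys and all births of year 2 are girls.
births = {(1, i, "boy") for i in range(4)} | {(2, i, "girl") for i in range(4)}
boys = {b for b in births if b[2] == "boy"}

overall = proportion(births, boys)
by_year = {
    year: proportion({b for b in births if b[0] == year}, boys)
    for year in (1, 2)
}
```

Here overall is one half, while by_year records a proportion of one for the
first year and zero for the second, which is the uneven spread that the
hypothesis is evidently meant to exclude.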
In what follows no attempt will be made to solve all these difficult
problems; we shall follow the usual custom in natural science and ignore
them. Attention will be confined to the one problem of formulating
Mendel's First Law. In the English translation of Mendel's paper of 1866,
which is given in W. BATESON'S Mendel's Principles of Heredity, Cambridge
1909, we read (p. 338):
Since the various constant forms are produced in one plant, or even in one
flower of a plant, the conclusion appears to be logical that in the ovaries of the
hybrids there are as many sorts of egg cells, and in the anthers as many sorts of
pollen cells, as there are possible constant combinations of forms, and that
these egg and pollen cells agree in their internal composition with those of the
separate forms.
In point of fact it is possible to demonstrate theoretically that this hypothesis
would fully suffice to account for the development of the hybrids in the sepa-
rate generations, if we might at the same time assume that the various kinds of
egg and pollen cells were formed in the hybrids on the average in equal num-
bers.
Bateson adds, in a foot-note to the last paragraph: 'This and the preceding
paragraph contain the essence of the Mendelian principles of heredity.'
It will be shown below that much more must be assumed than is ex-
plicitly stated here. L. Hogben, in Science for the Citizen, London, 1942, in
speaking of Mendel's Second Law mentions the first in the following
passage (p. 982) :
It is not, however, a law in the same sense as Mendel's First Law, of segregation,
which we have deduced above, for it is only applicable in certain cases, and as
we shall see later, the exceptions are of more interest than the rule.
But surely, Mendel's First Law is also only applicable in certain cases,
and if this is not generally recognized it is because the law is never so
formulated as to make clear what those cases are. We cannot simply say
that if we interbreed any hybrids the offspring will follow the same rules
as were reported in Mendel's experiments with garden peas, because it
would be possible to quote counter-examples. It is hoped that the follow-
ing analysis will throw some light on this question and that in this case
also the exceptions may prove to be of at least as much theoretical interest
as the rule. It will be shown that the condition referred to in the second of
the above two paragraphs from Mendel's 1866 paper is neither necessary
nor sufficient to enable us to derive the relative frequencies of the kinds
of offspring obtainable from the mating of hybrids. It is not sufficient
because it is also necessary to assume (among other things) that the union
of the gametes takes place at random. It is not necessary because if the
random union of the gametes is assumed the required frequencies can be
derived without the assumption of equal proportions of the kinds of
gametes. At the same time it will be seen that a number of other as-
sumptions are necessary which are not usually mentioned and thus that a
good deal is being taken for granted which may not always be justified.
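The two claims just made can be illustrated numerically. The Python sketch
below is not the derivation given later in the system; it assumes only two
kinds of gamete, labelled 'a' and 'b', and the function names and figures are
hypothetical. The first function derives offspring frequencies from random
union for any gamete proportions; the second counts the offspring of an
explicitly listed, non-random set of unions in which the two kinds of gamete
are nevertheless equally numerous.

```python
from collections import Counter
from fractions import Fraction


def offspring_frequencies(p_male, p_female):
    """Zygote frequencies when male and female gametes unite at random.

    p_male, p_female: proportion of kind 'a' among the male and the female
    gametes respectively; the remaining gametes are of kind 'b'.
    """
    p_m, p_f = Fraction(p_male), Fraction(p_female)
    q_m, q_f = 1 - p_m, 1 - p_f
    return {"aa": p_m * p_f, "ab": p_m * q_f + q_m * p_f, "bb": q_m * q_f}


def frequencies_from_pairs(pairs):
    """Zygote frequencies for an explicit list of (male kind, female kind) unions."""
    counts = Counter("".join(sorted(pair)) for pair in pairs)
    return {kind: Fraction(counts[kind], len(pairs)) for kind in ("aa", "ab", "bb")}


# Random union with equal proportions: the familiar 1/4, 1/2, 1/4.
equal = offspring_frequencies(Fraction(1, 2), Fraction(1, 2))

# Random union with unequal proportions: frequencies are still derivable.
unequal = offspring_frequencies(Fraction(1, 3), Fraction(1, 3))

# Equal numbers of 'a' and 'b' in each sex, but union is not at random:
# every 'a' gamete fuses with a 'b' gamete.
non_random = frequencies_from_pairs([("a", "b"), ("b", "a"), ("a", "b"), ("b", "a")])
```

In the last case the gametes are formed in equal numbers and yet every
offspring is of the mixed kind, so equality of numbers alone does not yield
the frequencies reported by Mendel.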
When we are axiomatizing we are primarily interested in ordering the
statements of a theory by means of the relation of logical consequence;
but where theories of natural science are concerned we are also interested
in another relation between statements, a relation which I will call the
relation of epistemic priority. A theory in natural science is like an ice-
berg — most of it is out of sight, and the relation of epistemic priority
holds between a statement A and a statement B when A speaks about
those parts of the iceberg which are out of water and B about those parts
which are out of sight; or A speaks about parts which are only a little
below the surface and B about parts which are deeper. In other words : A
is less theoretical, less hypothetical, assumes less than B. If A is the
statement
Macbeth is getting a view of a dagger
and B is the statement
Macbeth is seeing a dagger
then A is epistemically prior to B. Macbeth was in no doubt about A,
but he was in serious doubt about B and his doubts were confirmed when
he tried to touch the dagger but failed to get a feel of it. Again, if A is the
statement
Houses have windows so that people inside can see things
and B is the statement
Houses have windows in order to let the light in
then A is epistemically prior to B.
We not only say that Columbus discovered America, but also that J. J.
Thomson discovered electrons. In doing so we are clearly using the word
'discovered' in two distinct senses. What J. J. Thomson discovered in the
first sense was what we may expect to observe when an electrical discharge
is passed through a rarified gas. He then introduced the word 'electron'
into the language of physics in order to formulate a hypothesis from
which would follow the generalizations of his discoveries concerning
rarified gases. It will help to distinguish the two kinds of discoveries if we
call statements which are generalizations from accessible sets to inac-
cessible sets inductive hypotheses, and statements which are introduced in
order to have such hypotheses among their logical consequences explanatory
hypotheses. Then we can say that to every explanatory hypothesis $S_1$ there
is at least one inductive hypothesis $S_2$ such that $S_2$ is a
consequence of $S_1$ (or of $S_1$ in conjunction with other hypotheses) and is
epistemically prior to it. Were this not so $S_1$ would not be testable. But,
as we shall see later, it is also possible to have an explanatory hypothesis
$S_3$ and an inductive hypothesis $S_2$, which is not a consequence of $S_3$
although it is epistemically prior to it, both of which are consequences of the
same explanatory hypothesis $S_1$. If what you want to say can be expressed
just as well by a statement A as by a statement B then, if A is epistemically
prior to B, it will (if no other considerations are involved) be better to
use A. In what follows I shall try to formulate all the statements con-
cerned in the highest available epistemic priority. Statements concerning
parents and offspring only are epistemically prior to statements which also
speak about gametes and zygotes; and statements about gametes and
zygotes are epistemically prior to statements which speak also about the
parts of gametes and zygotes. The further we go from the epistemically
prior inductive hypotheses the more we are taking for granted and the
greater the possibility of error. The following discussion of Mendel's First
Law will be in terms of parents, offspring, gametes, zygotes and en-
vironments.
The foregoing remarks may now be illustrated by a brief reference to
Mendel's actual experiments. Suppose X and Y are accessible sets of
parents. Let us denote the set of all the offspring of these parents which
develop in environments belonging to the set E by
$$f_E(X, Y)$$
If all members of X resemble one another in some respect (other than
merely all being members of X) and all members of Y resemble one
another in some other respect (also other than merely all being members of
Y), so that the respect in which members of X resemble one another is
distinct from that in which members of Y resemble one another, then
$f_E(X, Y)$ constitutes an accessible set of hybrids. We also need $f_E^2(X, Y)$,
which is defined as follows:
$$f_E^2(X, Y) = f_E(f_E(X, Y), f_E(X, Y))$$
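For accessible sets the two functors can be computed directly from breeding
records. The Python sketch below is an illustration under stated assumptions,
not the formal definition of the system: the record layout (each offspring
mapped to its two parents and its environment), the names f and f2, and the
toy data are all hypothetical.

```python
def f(E, X, Y, records):
    """Offspring with one parent in X and the other in Y, developing in an
    environment belonging to E.

    records maps each offspring to a triple (parent, parent, environment).
    """
    result = set()
    for child, (parent1, parent2, env) in records.items():
        crossed = (parent1 in X and parent2 in Y) or (parent1 in Y and parent2 in X)
        if crossed and env in E:
            result.add(child)
    return result


def f2(E, X, Y, records):
    """Second generation: offspring both of whose parents belong to f(E, X, Y)."""
    first = f(E, X, Y, records)
    return f(E, first, first, records)


# Hypothetical breeding records.
records = {
    "h1": ("x1", "y1", "e1"),
    "h2": ("x2", "y1", "e1"),
    "g1": ("h1", "h2", "e1"),
    "g2": ("h1", "h1", "e1"),  # self-fertilization of a hybrid
    "g3": ("h2", "x1", "e1"),  # back-cross: only one parent is a hybrid
    "g4": ("h1", "h2", "e2"),  # develops in an environment outside E
}
X, Y, E = {"x1", "x2"}, {"y1"}, {"e1"}

hybrids = f(E, X, Y, records)
second_generation = f2(E, X, Y, records)
```

With these records the hybrids are h1 and h2, and the second generation
consists of g1 and g2 only; the back-cross g3 and the offspring g4 raised
outside E are excluded.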
Mendel experimented with seven pairs of mutually exclusive accessible
sets and the hybrids obtained by crossing them. It will suffice if we
consider one pair. Let 'A' denote the pea plants with which Mendel began
his experiments and which were tall in the sense of being about six feet
high; and let 'C' denote the peas which he used and which were dwarf in
the sense of being only about one foot high. Let us use 'T' to denote the
inaccessible set of all tall pea plants and 'D' to denote the inaccessible set
of all dwarf pea plants. Thus we have
$$A \subset T \quad\text{and}\quad C \subset D$$
Let us use 'B' to denote the set of all environments in which Mendel's
peas developed. Mendel first tested his As and Cs to discover whether
they bred true and found that they did because
$$f_B(A, A) \subset T \quad\text{and}\quad f_B^2(A, A) \subset T$$
$$f_B(C, C) \subset D \quad\text{and}\quad f_B^2(C, C) \subset D$$
He next produced hybrids and reported that
$$f_B(A, C) \subset T$$
Finally, he took 100 of the tall members of $f_B^2(A, C)$ and self fertilized
them. From 28 he obtained only tall plants and from 72 he obtained some
tall and some dwarf. This indicated that about one third of the tall plants
of $f_B^2(A, C)$ were pure breeding talls like $f_B(A, A)$ and two thirds were like
the hybrid talls or $f_B(A, C)$.
Closely similar results were obtained in the other six experiments,
although the respects in which the plants differed were in those cases not
concerned with height but with colour or form of seed or pod or the
position of the flowers on the stem. In each case the hybrids all resembled
only one of the parental types, which Mendel accordingly called the
dominant one. The parental type which was not represented in the first
hybrid generation, but which reappeared in the second, he called the
recessive one. Mendel took the average of the seven experiments and summed
up as follows:
If now the results of the whole of the experiments be brought together, there is
found, as between the number of forms with the dominant and recessive charac-
ters, an average ratio of 2.98 to 1, or 3 to 1.
So long as we assert that the average ratio is 2.98 to 1 we are dealing with
accessible sets and have no law or explanatory hypothesis. But what does
Mendel's addition 'or 3 to 1' mean? Presumably these few words express
the leap from an observed proportion in an accessible set to a hypotheti-
cal proportion in an inaccessible set. This represents Mendel's discovery as
opposed to his observations. At the same time there is no proposal to
extend this beyond garden peas. This extension was done by Mendel's
successors who, on the basis of many observations, extended his gener-
alization regarding the proportions of kinds of offspring of hybrids over
a wide range of inaccessible sets not only of plants but also of animals. In
addition to this Mendel also left us his explanatory hypothesis, the
hypothesis namely that the hybrids produce gametes of two kinds — one
resembling the gametes produced by the pure dominant parents, and the
other resembling those produced by the recessive parents. He also assumed
that these two kinds of gametes were produced in equal numbers. We have
now to consider what is the minimum theoretical basis for deriving this
hypothesis as a theorem in an axiom system.
A GENETICAL AXIOM SYSTEM
(In what follows the axiom system is given in the symbolic notation of set-
theory, sentential calculus and the necessary biological functors (the last in
bold-face type). Accompanying this is a running commentary in words in-
tended to assist the reading of the system ; but it must be understood that this
commentary forms no part of the system itself.)
The following primitives suffice for the construction of a genetical
axiom system expressed on the level of epistemic priority here adopted;
for cyto-genetics (and even perhaps for extending the present system)
additional primitives are necessary.
(i) 'uFx' for 'u is a gamete which fuses with another gamete to form the
zygote (fertilized egg) x'.
(ii) 'dlz xyz' for 'x is a zygote which develops in the environment y into
the life z.'
(iii) 'u gam z' for 'u is a gamete produced by the life z.'
(iv) ♂ is the class of all male gametes,
(v) ♀ is the class of all female gametes,
(vi) 'phen' is an abbreviation for 'phenotype'
The following postulates are needed for the derivation of the theorems
which are to follow:
POSTULATE 1 $(u)(v)(x) : u\mathbf{F}x \,.\, v\mathbf{F}x \,.\, u \neq v \,.\supset.\, \sim(\exists w) \,.\, w\mathbf{F}x \,.\, w \neq u \,.\, w \neq v$
This asserts that not more than two gametes fuse to form each zygote.
POSTULATE 2 $(x)(w) : (\exists u) \,.\, u\mathbf{F}x \,.\, u\mathbf{F}w \,.\supset.\, x = w$
This asserts that if a gamete unites with another to form a zygote then
there is no other zygote for which this is true.
POSTULATE 3 $(u)(v)(x) :. u\mathbf{F}x \,.\, v\mathbf{F}x \,.\, u \neq v \,.\supset:\, u \in ♂ \,.\, v \in ♀ \,.\vee.\, u \in ♀ \,.\, v \in ♂$
This asserts that of the two gametes which unite to form any zygote one is
a male gamete and the other a female gamete.
POSTULATE 4 $♂ \cap ♀ = \Lambda$
This asserts that no gamete is both male and female.
POSTULATE 5 $(x)(y)(z)(u)(v) : \mathbf{dlz}\,xyz \,.\, \mathbf{dlz}\,uvz \,.\supset.\, x = u \,.\, y = v$
This asserts that every life develops in one and only one environment from
one and only one zygote.
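POSTULATE 6 $(x)(y)(z)(x')(y')(z') : \mathbf{dlz}\,xyz \,.\, \mathbf{dlz}\,x'y'z' \,.\, (\exists u) \,.\, u\,\mathbf{gam}\,z \,.\, u\,\mathbf{gam}\,z' \,.\supset.\, x = x'$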
This asserts that if there is a gamete produced by a life z and the same
gamete is produced by a life z' then the zygote from which z develops is
identical with the zygote from which z1 develops. This may seem strange
until it is explained that by 'a life' is here meant something with a be-
ginning and an end in time and a fixed time extent. The expression is thus
being used in a way somewhat similar to the way in which it is used in
connexion with life insurance. Suppose a zygote is formed at midnight on
a certain day; suppose it develops for, say, ten days and on the tenth day death
occurs. Then the whole time-extended object of ten days' duration 'from
fertilization to funeral' is a life which is complete in time. But suppose we
are only concerned with what happens during the first ten hours; then
that also is a life, in the sense in which the word is here used, and one
which is a proper part of the former one. Now if a gamete is said to be
produced by the shorter life it is also produced by the longer one of which
that shorter one is a part ; we cannot identify the two lives but we can say
that they both develop from the same zygote. As here understood the
time-length of a life fixes its environment ; because the environment of a
life is the sphere and its contents which has the zygote from which de-
velopment begins as its centre and a radius which is equal in light-years to
the length of the life in years. But no time-metric is needed for the present
system and many complications are therefore avoided.
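As an illustration only, the first four postulates can be checked mechanically on a small finite model. The sketch below assumes a toy representation that is not part of the system itself: the relation F is a set of (gamete, zygote) pairs, and the names FUSES, MALE and FEMALE are hypothetical. Postulates 5 and 6 would need the relations 'dlz' and 'gam' as well and are left out.

```python
from itertools import combinations

# Hypothetical toy model: each pair (u, x) is read as 'uFx'.
MALE = {"m1", "m2"}
FEMALE = {"f1", "f2"}
FUSES = {("m1", "z1"), ("f1", "z1"), ("m2", "z2"), ("f2", "z2")}


def gametes_of(zygote, fuses):
    """All gametes u with uFx for the given zygote x."""
    return {u for (u, x) in fuses if x == zygote}


def postulate_1(fuses):
    # Not more than two gametes fuse to form each zygote.
    return all(len(gametes_of(x, fuses)) <= 2 for (_, x) in fuses)


def postulate_2(fuses):
    # A gamete that helps to form one zygote helps to form no other.
    return all(x == w for (u, x) in fuses for (v, w) in fuses if u == v)


def postulate_3(fuses, male, female):
    # Two distinct gametes forming one zygote are one male and one female.
    for (_, x) in fuses:
        for u, v in combinations(sorted(gametes_of(x, fuses)), 2):
            if not ((u in male and v in female) or (u in female and v in male)):
                return False
    return True


def postulate_4(male, female):
    # No gamete is both male and female.
    return male.isdisjoint(female)


def satisfies_postulates_1_to_4(fuses, male, female):
    return (postulate_1(fuses) and postulate_2(fuses)
            and postulate_3(fuses, male, female) and postulate_4(male, female))
```

Adding a third gamete to a zygote of the toy model, for instance the pair ("m2", "z1"), violates both the first and the second postulate in this representation.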
All the primitive notions of this system are either relations between
individuals or are classes of individuals. But the statements of genetics
with which we are concerned in what follows do not speak of individual
lives, individual environments, individual zygotes or individual gametes
but of classes of individuals and of relations between such classes. But
the classes we require are definable by means of the primitives.
DEFINITION 1 $x \in U(\alpha, \beta) .\equiv: (\exists u)(\exists v) . u \in \alpha . v \in \beta . u \neq v . uFx . vFx$
We thus use '$U(\alpha, \beta)$' to denote the class of all zygotes which are formed by
the union of a gamete belonging to the class $\alpha$ with one belonging to the
class $\beta$.
DEFINITION 2 $z \in L_E(Z) .\equiv: (\exists x)(\exists y) . \mathrm{dlz}\ xyz . x \in Z . y \in E$
'$L_E(Z)$' is used to denote the class of all lives which develop from a zygote
belonging to the class Z in an environment belonging to the class E.
DEFINITION 3 $u \in G_E(X) .\equiv: (\exists x)(\exists y)(\exists z) . \mathrm{dlz}\ xyz . y \in E . z \in X . u\ \mathrm{gam}\ z$
'$G_E(X)$' thus denotes the class of all gametes which are produced by lives
belonging to the class X when they develop in environments belonging to
the class E.
DEFINITION 4 $z \in \mathrm{Fil}_{K,M,E}(X, Y) .\equiv: (\exists x)(\exists y)(\exists u)(\exists v) . u \in G_K(X) . v \in G_M(Y) . uFx . vFx . u \neq v . y \in E . \mathrm{dlz}\ xyz$
The letters 'Fil' are taken from the word 'filial'. The above definition
provides a notation for the class of all offspring which develop in environ-
ments belonging to the class E and having one parent belonging to the
class X and developing in an environment belonging to the class K and
the other parent belonging to the class Y and developing in an environ-
ment belonging to the class M. For Mendelian contexts only one en-
vironmental class need be considered ; provision for this simplification is
made below.
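The four definitions translate almost word for word into set comprehensions. The following is a minimal sketch under assumptions of my own: the three primitive relations are stored as finite sets of tuples, and all the data (z1, lifeP, e1 and so on) are hypothetical.

```python
# Hypothetical toy relations standing for the primitives.
F = {("m1", "z1"), ("f1", "z1")}              # (u, x): u fuses to form zygote x
DLZ = {("z1", "e1", "life1"),                 # (x, y, z): zygote x develops
       ("zp", "e1", "lifeP"),                 #            in environment y
       ("zq", "e1", "lifeQ")}                 #            into life z
GAM = {("m1", "lifeP"), ("f1", "lifeQ")}      # (u, z): gamete u produced by life z


def U(alpha, beta):
    """Definition 1: zygotes formed from one gamete in alpha and one in beta."""
    return {x for (u, x) in F for (v, w) in F
            if x == w and u != v and u in alpha and v in beta}


def L(E, Z):
    """Definition 2: lives developing from a zygote in Z in an environment in E."""
    return {z for (x, y, z) in DLZ if x in Z and y in E}


def G(E, X):
    """Definition 3: gametes produced by lives in X developing in environments in E."""
    return {u for (u, z) in GAM for (x, y, w) in DLZ
            if z == w and y in E and z in X}


def Fil(K, M, E, X, Y):
    """Definition 4: offspring, in environments E, of parents from X (in K) and Y (in M)."""
    zygotes = U(G(K, X), G(M, Y))
    return {z for (x, y, z) in DLZ if x in zygotes and y in E}
```

With these data, Fil({"e1"}, {"e1"}, {"e1"}, {"lifeP"}, {"lifeQ"}) picks out the single life that develops from the zygote formed by the gametes of the two parents.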
The above four definitions suffice for most purposes. But it frequently
happens that we need to substitute one of the above expressions for the
variables of another and in that way very complicated expressions may
arise. In order to avoid this the following abbreviations are introduced by
definition :
DEFINITION 5 $D(\alpha, \beta, E) = L_E(U(\alpha, \beta))$
DEFINITION 6 $G(\alpha, \beta, E) = G_E(L_E(U(\alpha, \beta)))$
DEFINITION 7 $F_{K,M,E}(\alpha, \beta; \gamma, \delta) = \mathrm{Fil}_{K,M,E}(D(\alpha, \beta, K), D(\gamma, \delta, M))$
DEFINITION 8 $F_E(\alpha, \beta; \gamma, \delta) = F_{E,E,E}(\alpha, \beta; \gamma, \delta)$
All the foregoing notions are general and familiar ones. We must now
turn to some of a more special and novel kind. If our present inquiry were
not confined to the single topic of Mendel's First Law we should at this
stage introduce the notion of a genetical system, and we should maintain
that genetical systems as then intended constitute the proper objects of
genetical investigations. But for the present purpose it suffices if we speak
of a specially simple kind of genetical system which we shall call genetical
units. A genetical unit is a set of three classes : one is a phenotype, another
is a class of gametes and the third is a class of environments ; — provided
certain conditions are satisfied. Suppose $\{P, \alpha, E\}$ is a candidate for the
title of genetical unit; then it must be developmentally closed, that is to say
$D(\alpha, \alpha, E)$ must be a non-empty class and it must be included in the
phenotype P; next it must be genetically closed, that is to say $G(\alpha, \alpha, E)$
must be non-empty and must be included in $\alpha$. Thus neither the process of
development nor that of gamete-formation takes us out of the system;
it thus 'breeds true'. The official definition is:
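DEFINITION 9 $S \in \mathrm{genunit} .\equiv: (\exists P)(\exists \alpha)(\exists E) . P \in \mathrm{phen} . S = \{P, \alpha, E\} . D(\alpha, \alpha, E) \neq \Lambda . D(\alpha, \alpha, E) \subset P . G(\alpha, \alpha, E) \neq \Lambda . G(\alpha, \alpha, E) \subset \alpha$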
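The two closure conditions can be stated as a small predicate. This is a sketch under stated assumptions rather than part of the system: the functors D and G are passed in as ordinary Python callables, the stand-ins D_true, G_true and D_leaky are hypothetical, and the requirement that P be a phenotype is not modelled.

```python
def is_genetical_unit(P, alpha, E, D, G):
    """Closure conditions of Definition 9 for the candidate {P, alpha, E}.

    D and G are callables standing for the functors D(a, b, E) and G(a, b, E);
    the condition 'P is a phenotype' is not modelled here.
    """
    lives = D(alpha, alpha, E)
    gametes = G(alpha, alpha, E)
    developmentally_closed = bool(lives) and lives <= P
    genetically_closed = bool(gametes) and gametes <= alpha
    return developmentally_closed and genetically_closed


# Hypothetical stand-ins: a line that breeds true and one that does not.
TALL = frozenset({"t1", "t2"})
ALPHA = frozenset({"a1", "a2"})
ENV = frozenset({"garden"})


def D_true(a, b, E):
    return frozenset({"t1"})


def G_true(a, b, E):
    return frozenset({"a1", "a2"})


def D_leaky(a, b, E):
    # One of the resulting lives falls outside the phenotype TALL.
    return frozenset({"t1", "short1"})
```

The candidate fails as soon as either class is empty or leaves the system, which is the sense in which a genetical unit 'breeds true'.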
The genetical systems with which Mendel worked were genetical units,
sums of two genetical units and what may be called set-by-set products
of such sums. Thus if $\{P, \alpha, E\}$ and $\{Q, \beta, E\}$ are genetical units with the
phenotype P dominant to the phenotype Q we shall have $D(\alpha, \beta, E) \neq \Lambda$
and $D(\alpha, \beta, E) \subset P$, so that $\{P, Q, \alpha, \beta, E\}$, the sum of the two units, is
developmentally closed; if we also have $G(\alpha, \beta, E) \neq \Lambda$ and
$G(\alpha, \beta, E) \subset \alpha \cup \beta$, then the sum is also genetically closed. As we shall see shortly these
assumptions do not suffice to enable us to infer that the sum will behave
according to the Mendelian generalizations. If $\{R, \gamma, E\}$ and $\{S, \delta, E\}$ are
two more genetical units so that $\{R, S, \gamma, \delta, E\}$ is their sum, then the
set-by-set product of this sum and the former one will be
$\{P \cap R, Q \cap R, P \cap S, Q \cap S, \alpha \cap \gamma, \beta \cap \gamma, \alpha \cap \delta, \beta \cap \delta, E\}$
and if it is developmentally and genetically closed this will constitute yet
another type of genetical system which was studied by Mendel and with
which his Second Law was concerned.
Before we can proceed with the biological part of our system we must
now say something about the set-theoretical framework within which it is
being formulated and on the basis of which proofs of theorems are carried
out. We begin with two important definitions, one of which has already
been mentioned. (The definitions and theorems of this part of the system
will have Roman numerals assigned to them in order to distinguish them
from biological definitions and theorems).
DEFINITION I $X \in pY .\equiv. \dfrac{N(X \cap Y)}{N(X)} = p . 0 \leq p \leq 1$
pY is thus the set of all classes which have a proportion p of their members
belonging to Y. N(X) is the cardinal number of the class X.
DEFINITION II $Z \in [X, Y] .\equiv: (\exists u)(\exists v) . u \in X . v \in Y . u \neq v . Z = \{u, v\}$
[X, Y] is the pair-set of the classes X and Y, that is to say it is the set of
all pairs (unordered) having one member belonging to the class X and the
other to the class Y.
No attempt is made here to present the set-theoretical background
axiomatically. We simply list, for reference purposes, the following
theorems which can be proved within (finite) set theory and arithmetic.
THEOREM I $N(X) = 0 .\equiv. X = \Lambda$
THEOREM II $X \subset Y .\supset. N(X) \leq N(Y)$
THEOREM III $N(X \cup Y) = N(X \cap \bar{Y}) + N(X \cap Y) + N(\bar{X} \cap Y)$
THEOREM IV $X \cap Y = \Lambda .\supset. N(X \cup Y) = N(X) + N(Y)$
THEOREM V $N([X, Y]) = N(X \cap \bar{Y}) . N(\bar{X} \cap Y) + N(X \cap Y) . [N(X \cap \bar{Y}) + N(\bar{X} \cap Y) + \tfrac12(N(X \cap Y) - 1)]$
THEOREM VI $X \cap Y = \Lambda .\supset. N([X, Y]) = N(X) . N(Y)$
THEOREM VII $X \neq \Lambda . X \subset Y .\equiv. X \in 1Y$
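THEOREM VIII $X \in pY \cap qY .\supset. p = q$
THEOREM IX $X \neq \Lambda .\supset. (\exists p) . \ldots$
THEOREM X $X \neq \Lambda . X \subset Y \cup Z . Y \cap Z = \Lambda .\equiv. \ldots X \in pY \cap (1 \ldots$
THEOREM XI $X \cap Y = \Lambda .\supset. (pX \cap qY) \subset (p + q)(X \cup Y)$
THEOREM XII $Y \subset P . Z \subset Q . P \cap Q = \Lambda .\supset. (pY \cap (1 - p)Z) \ldots$
THEOREM XIII $Y \subset A . Z \subset B . W \subset C . A \cap B = B \cap C = C \cap A = \Lambda .\supset. (pY \cap qZ \cap (1 - p - q)W) \subset (pA \cap qB \cap (1 - p \ldots$
THEOREM XIV $A \subset P . B \subset P . C \subset Q . A \cap B = B \cap C = C \cap A = P \cap Q = \Lambda .\supset. (pA \cap qB \cap (1 - p - q)C) \subset (p + q)P \cap (1 - \ldots$
THEOREM XV $X \cap Y = \alpha \cap \beta = \Lambda . X \in p\alpha \cap (1 - p)\beta . \ldots$
THEOREM XVI $X \cap Y = \alpha \cap \beta = \Lambda .\supset: X \in \tfrac12\alpha \cap \tfrac12\beta . Y \in \tfrac12\alpha \cap \tfrac12\beta .\equiv. \ldots$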
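Several of the counting theorems listed above can be confirmed by brute force over all pairs of subsets of a small universe. The sketch below is only a check on finite examples, not a proof; the helper names (proportion, pair_set, subsets) are my own, and the last assertion is a case-by-case count of the pair-set worked out for this sketch rather than a quotation of any theorem in the list.

```python
from fractions import Fraction
from itertools import chain, combinations


def proportion(X, Y):
    """The p with X in pY in the sense of Definition I; None when X is empty."""
    return Fraction(len(X & Y), len(X)) if X else None


def pair_set(X, Y):
    """[X, Y] of Definition II: unordered pairs {u, v}, u in X, v in Y, u != v."""
    return {frozenset((u, v)) for u in X for v in Y if u != v}


def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def check_counting_theorems(universe):
    for X in subsets(universe):
        for Y in subsets(universe):
            a, b, c = len(X - Y), len(Y - X), len(X & Y)
            assert len(X | Y) == a + c + b                      # Theorem III
            if X <= Y:
                assert len(X) <= len(Y)                         # Theorem II
            if not (X & Y):
                assert len(X | Y) == len(X) + len(Y)            # Theorem IV
                assert len(pair_set(X, Y)) == len(X) * len(Y)   # Theorem VI
            if X and X <= Y:
                assert proportion(X, Y) == 1                    # Theorem VII
            # Direct count of the pair-set by cases (worked out for this sketch).
            assert len(pair_set(X, Y)) == a * b + c * (a + b) + c * (c - 1) // 2
    return True
```

Calling check_counting_theorems({1, 2, 3, 4}) runs through all 256 pairs of subsets of a four-element universe.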
We can now return to the biological part of our system. In genetical
statements the notion of randomness frequently occurs. It will be re-
quired in two places in the present context. In both of these it means
persistence of certain relative frequencies during a process. It means the
absence of selection or favouritism.
We shall say that a set S which is the sum of two genetical units is
random with respect to U or that the union of the gametes is random in S if
and only if X and Y being any classes of gametes of the form $G(\alpha, \beta, E)$,
$\alpha$ and $\beta$ being any gamete classes and E the environment class of S,
whenever we have
$[X, Y] \in p[\gamma, \delta]$
we also have
$U(X, Y) \in pU(\gamma, \delta)$
$\gamma$ and $\delta$ also being gamete-classes of S. The following definition covers
cases where S has an additional phenotype because there is no dominance.
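DEFINITION 10 $S \in \mathrm{rand}\ U .\equiv:. (\alpha)(\beta)(\gamma)(\delta)(\zeta)(\theta)(E)(p) : (\exists S_1)(\exists S_2) : S_1, S_2 \in \mathrm{genunit} . S = S_1 \cup S_2 .\lor. (\exists R) . R \in \mathrm{phen} . S = S_1 \cup S_2 \cup \{R\} . \alpha, \beta, \gamma, \delta, \zeta, \theta, E \in S . [G(\alpha, \beta, E), G(\gamma, \delta, E)] \in p[\zeta, \theta] .\supset. U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta)$
Analogously we can say that such a set S is random with respect to D(E)
or that development in members of the environment class E of S is random
if and only if, whenever we have
$U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta)$
we also have
$D(G(\alpha, \beta, E), G(\gamma, \delta, E), E) \in pD(\zeta, \theta, E)$
the Greek letters all being variables whose values are the gamete classes
of S and 'E' being a variable whose single value (in Mendelian cases) is the
environmental class of S.
DEFINITION 11 $S \in \mathrm{rand}\ D(E) .\equiv:. (\alpha)(\beta)(\gamma)(\delta)(\zeta)(\theta)(E)(p) : (\exists S_1)(\exists S_2) : S_1, S_2 \in \mathrm{genunit} . S = S_1 \cup S_2 .\lor. (\exists R) . R \in \mathrm{phen} . S = S_1 \cup S_2 \cup \{R\} . \alpha, \beta, \gamma, \delta, \zeta, \theta, E \in S . U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta) :\supset. D(G(\alpha, \beta, E), G(\gamma, \delta, E), E) \in pD(\zeta, \theta, E)$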
We now give a list of biological theorems which are provable from the
postulates and definitions and are used in the proofs of the major theorems
to follow. On the right hand side of each theorem are indicated the
postulates (P), definitions (D) or theorems (T) required for its proof.
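THEOREM 1 $U(\alpha, \beta) = U(\beta, \alpha)$ [D1]
THEOREM 2 $U(X, X) = U(X \cap ♂, X \cap ♀)$ [D1, P3]
THEOREM 3 $\alpha \cap \beta = \Lambda .\supset. U(\alpha, \alpha) \cap U(\beta, \beta) = \Lambda$ [P1, D1]
THEOREM 4 $\alpha \cap \beta = \Lambda .\supset. U(\alpha, \alpha) \cap U(\alpha, \beta) = \Lambda$ [P1, D1]
THEOREM 5 $E \cap K = \Lambda .\lor. Z \cap W = \Lambda :\supset. L_E(Z) \cap L_K(W) = \Lambda$ [D2, P2]
THEOREM 6 $U(\alpha, \alpha) \cap U(\beta, \beta) = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\beta, \beta, E) = \Lambda$ [D5, D2, P2]
THEOREM 7 $U(\alpha, \alpha) \cap U(\alpha, \beta) = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = \Lambda$ [D5, T5]
THEOREM 8 $\alpha \cap \beta = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\beta, \beta, E) = \Lambda$ [T3, T6]
THEOREM 9 $\alpha \cap \beta = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = \Lambda$ [T4, T7]
THEOREM 10 $D(X \cap ♂, X \cap ♀, E) = D(X, X, E)$ [D5, T2]
THEOREM 11 $\alpha \cap \beta = \Lambda .\supset. G(\alpha, \beta, E) \cap G(\beta, \beta, E) = \Lambda$ [D6, D3, D2, P5, P6, T4]
THEOREM 12 $G(\alpha, \alpha, E) \subset \alpha .\supset. D(G(\alpha, \alpha, E), G(\alpha, \alpha, E), E) \subset D(\alpha, \alpha, E)$ [D1, D2, D5]
THEOREM 13 $F_E(\alpha, \beta; \gamma, \delta) = D(G(\alpha, \beta, E), G(\gamma, \delta, E), E)$ [D8, D7, D4, D5, D6, D1, D2]
THEOREM 14 $\{P, \alpha, E\} \in \mathrm{genunit} .\supset. F_E(\alpha, \alpha; \alpha, \alpha) \subset P$ [T13, D9, T12]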
By a mating description is meant a statement of the form $X \subset Y$ or
$X \in pY$ where 'X' is an expression denoting a set of offspring, e.g.
'$F_E(\alpha, \beta; \alpha, \beta)$' and 'Y' denotes a phenotype. We turn now to the task of
discovering what must be assumed in order to derive the characteristic
Mendelian mating descriptions, beginning with that which asserts the
relative frequencies of dominants and recessives in the offspring of
hybrids when these are mated with one another. For reference purposes
it will be convenient if we use abbreviations for groups of the various
separate hypotheses which enter into the antecedents of the following
theorems. Let us therefore put :
H 1. for: $\{P, \alpha, E\}, \{Q, \beta, E\} \in \mathrm{genunit} . P \cap Q = \alpha \cap \beta = \Lambda$
($\{P, \alpha, E\}$ and $\{Q, \beta, E\}$ are genetical units and P and Q, and $\alpha$ and $\beta$, are
mutually exclusive)
H 2. for: $D(\alpha, \beta, E) \subset P$
(the hybrids are included in the phenotype P)
H 3. for: $(\exists R) . R \in \mathrm{phen} . D(\alpha, \beta, E) \subset R$
(this covers cases where there is no dominance but the hybrids are
included in a third phenotype R)
H 4a. for: $G(\alpha, \beta, E) \cap ♂ \in \tfrac12\alpha \cap \tfrac12\beta . G(\alpha, \beta, E) \cap ♀ \in \tfrac12\alpha \cap \tfrac12\beta$
(This is one form of Mendel's own hypothesis. He assumed that in
the gametes of the hybrids the two kinds occurred in equal numbers
both in the case of male and in the case of female gametes. Theorem
XVI shows that the above form is equivalent to this.)
H 4b. for: $G(\alpha, \beta, E) \cap ♂ \neq \Lambda . G(\alpha, \beta, E) \cap ♂ \subset \alpha \cup \beta . G(\alpha, \beta, E) \cap ♀ \neq \Lambda . G(\alpha, \beta, E) \cap ♀ \subset \alpha \cup \beta$
(This is a weaker form of H 4a because it only assumes non-emptiness
and inclusion.)
H 4c. for: $G(\alpha, \beta, E) \neq \Lambda . G(\alpha, \beta, E) \subset \alpha \cup \beta$
(This is weaker still because it does not make separate statements
regarding the gametes of different sex.)
H 5. for: $S = \{P, \alpha, E\} \cup \{Q, \beta, E\} . S \in \mathrm{rand}\ U \cap \mathrm{rand}\ D(E)$
(This is the hypothesis that the system in question is the sum of
two genetical units (H 1) and is random both with respect to the
union of the gametes and also with respect to the development of
the resulting zygotes in the environments belonging to E.)
H 5a. for: $S = \{P, \alpha, E\} \cup \{Q, \beta, E\} \cup \{R\}$ and $S \in \mathrm{rand}\ U \cap \mathrm{rand}\ D(E)$
(This is to cover the cases when there is no dominance.)
The following theorems are asserted for all values of the variables P,
Q, R, $\alpha$, $\beta$, E.
THEOREM 15 states that if we have H 1, H 2, H 4a and H 5 we also
have three quarters of the offspring of the hybrids belonging to the dominant
and the remaining quarter to the recessive phenotype.
THEOREM 15 $\mathrm{H1} . \mathrm{H2} . \mathrm{H4a} . \mathrm{H5} .\supset. F_E(\alpha, \beta; \alpha, \beta) \in \tfrac34 P \cap \tfrac14 Q$
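In order to make all the steps explicit we give the following derivation of
this theorem:
(1) Using 'X' as an abbreviation of '$G(\alpha, \beta, E)$' we have, by H 1 and P 4:
$(X \cap ♂) \cap (X \cap ♀) = \alpha \cap \beta = \Lambda$
(2) By (1), H 4a and T XV we can write:
$[X \cap ♂, X \cap ♀] \in \tfrac12.\tfrac12[\alpha, \alpha] \cap 2(\tfrac12.\tfrac12)[\alpha, \beta] \cap \tfrac12.\tfrac12[\beta, \beta]$
(3) From (2), H 5 and D 10 we are now able to obtain:
$U(X \cap ♂, X \cap ♀) \in \tfrac14 U(\alpha, \alpha) \cap \tfrac12 U(\alpha, \beta) \cap \tfrac14 U(\beta, \beta)$
(4) We next obtain from (3), H 5 and D 11:
$D(X \cap ♂, X \cap ♀, E) \in \tfrac14 D(\alpha, \alpha, E) \cap \tfrac12 D(\alpha, \beta, E) \cap \tfrac14 D(\beta, \beta, E)$
(5) From H 1, D 9 and H 2 we have:
$D(\alpha, \alpha, E) \subset P$ and $D(\alpha, \beta, E) \subset P$ and $D(\beta, \beta, E) \subset Q$
(6) From H 1 we have $\alpha \cap \beta = \Lambda$ and so with the help of T 8 and T 9
we get:
$D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = D(\alpha, \beta, E) \cap D(\beta, \beta, E) = D(\beta, \beta, E) \cap D(\alpha, \alpha, E) = \Lambda$
(7) From (5) and (6) with the help of T XIV we now get:
$\tfrac14 D(\alpha, \alpha, E) \cap \tfrac12 D(\alpha, \beta, E) \cap \tfrac14 D(\beta, \beta, E) \subset \tfrac34 P \cap \tfrac14 Q$
(8) By T 10 we have:
$D(X \cap ♂, X \cap ♀, E) = D(X, X, E)$
(9) From (4), (7) and (8) we obtain:
$D(X, X, E) \in \tfrac34 P \cap \tfrac14 Q$
(10) Putting '$G(\alpha, \beta, E)$' for 'X' in (9) in accordance with (1):
$D(G(\alpha, \beta, E), G(\alpha, \beta, E), E) \in \tfrac34 P \cap \tfrac14 Q$
(11) By substitution of '$\alpha$' for '$\gamma$' and '$\beta$' for '$\delta$' in T 13 we get:
$F_E(\alpha, \beta; \alpha, \beta) = D(G(\alpha, \beta, E), G(\alpha, \beta, E), E)$
(12) Finally from (10) and (11) we obtain the required result:
$F_E(\alpha, \beta; \alpha, \beta) \in \tfrac34 P \cap \tfrac14 Q$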
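The arithmetic behind steps (2) to (7) can be followed on a small finite example. The sketch below assumes hypothetical gamete classes of my own choosing, with half of the male and half of the female gametes of the hybrids in each of the two classes; it only counts pairs, and it takes for granted that random union and random development carry the pair proportions over unchanged to zygotes and to lives.

```python
from fractions import Fraction


def pair_proportions(males, females, alpha, beta):
    """Proportions of male-female gamete pairs falling in [a,a], [a,b], [b,b]."""
    pairs = [(m, f) for m in males for f in females]
    n = len(pairs)
    aa = sum(1 for m, f in pairs if m in alpha and f in alpha)
    bb = sum(1 for m, f in pairs if m in beta and f in beta)
    ab = n - aa - bb  # valid when every gamete lies in exactly one of alpha, beta
    return Fraction(aa, n), Fraction(ab, n), Fraction(bb, n)


# Hypothetical hybrid gametes: half alpha and half beta in each sex (as in H 4a).
ALPHA = {"m_a1", "m_a2", "f_a1", "f_a2"}
BETA = {"m_b1", "m_b2", "f_b1", "f_b2"}
MALES = {"m_a1", "m_a2", "m_b1", "m_b2"}
FEMALES = {"f_a1", "f_a2", "f_b1", "f_b2"}

aa, ab, bb = pair_proportions(MALES, FEMALES, ALPHA, BETA)
dominant = aa + ab   # (a,a) and (a,b) lives both fall in P when P is dominant
recessive = bb
```

Because the male and female classes are disjoint, each ordered pair (m, f) corresponds to exactly one unordered pair of the pair-set, so counting ordered pairs gives the same proportions.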
Before commenting on this we shall give the remaining theorems.
THEOREM 16 is concerned with the offspring of hybrids when mated
with the recessive parents, a mating type commonly called a back-cross.
It is stated here in a somewhat unusual form and with the weakest
possible antecedent. It states that if the hypotheses H 1, H 2, H 4c and
H 5 are adopted then we should expect the proportions of the two phenotypes
in the offspring to be identical with the proportions of the two
kinds of gametes in the gametes produced by the hybrids. If, therefore,
we assume, on the basis of samples, that $F_E(\beta, \beta; \alpha, \beta) \in \tfrac12 P \cap \tfrac12 Q$ we must
also assume that $G(\alpha, \beta, E) \in \tfrac12\alpha \cap \tfrac12\beta$. The first of these hypotheses is
epistemically prior to the second and yet they both occur together in the
consequent of this theorem.
THEOREM 16 $\mathrm{H1} . \mathrm{H2} . \mathrm{H4c} . \mathrm{H5} .\supset. (\exists p) . F_E(\beta, \beta; \alpha, \beta) \in pP \cap (1 - p)Q . G(\alpha, \beta, E) \in p\alpha \cap (1 - p)\beta$
The derivation of Theorem 16 requires: T 13, T 11, T X, T XV, T VII,
D 9, D 10, D 11 and T XII.
In the next theorem we have the same antecedent as in Theorem 15
except that nothing is assumed about the relative proportions of the two
kinds of gamete in the gametes produced by the hybrids.
THEOREM 17 $\mathrm{H1} . \mathrm{H2} . \mathrm{H4b} . \mathrm{H5} .\supset. (\exists p)(\exists q) . F_E(\alpha, \beta; \alpha, \beta) \in (p - pq + q)P \cap (1 - p)(1 - q)Q . G(\alpha, \beta, E) \cap ♂ \in p\alpha \cap (1 - p)\beta . G(\alpha, \beta, E) \cap ♀ \in q\alpha \cap (1 - q)\beta$
In this case, if we assume, as a result of sampling, that $(p - pq + q) = \tfrac34$
and $(1 - p)(1 - q) = \tfrac14$ we cannot determine the value of p and of q. But
if p has first been ascertained with the help of THEOREM 16 and sampling
then (at least when p = q) the result can be applied to THEOREM 17. For
the derivation of this theorem we require T X, P 4, T XV, D 10, D 11,
T 2, D 5, D 9, T 9, T 8, T 13, T XIV. The next theorem is the theorem
corresponding to THEOREM 17 in systems where there is no dominance.
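THEOREM 18 $\mathrm{H1} . \mathrm{H4b} . \mathrm{H5a} .\supset. (\exists p)(\exists q) . F_E(\alpha, \beta; \alpha, \beta) \in pqP \cap (p(1 - q) + q(1 - p))R \cap (1 - p)(1 - q)Q . G(\alpha, \beta, E) \cap ♂ \in p\alpha \cap (1 - p)\beta . G(\alpha, \beta, E) \cap ♀ \in q\alpha \cap (1 - q)\beta$
In this case, if on the basis of sampling we assign a value to $pq$ and to
$(p(1 - q) + q(1 - p))$, then we can determine the values of p and q. The
theorem requires: T X, P 4, T XV, D 10, D 11, T 2, D 5, T 9, T 8,
T XIII, T 13.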
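The difference between the two cases is a matter of simple algebra, which the following sketch works through numerically. It is an illustration of my own, using floating-point arithmetic and hypothetical function names. Where one phenotype is dominant the two sampled quantities are not independent, since $p - pq + q = 1 - (1 - p)(1 - q)$; where there is no dominance the two quantities $pq$ and $p(1 - q) + q(1 - p)$ give the product and (after adding $2pq$) the sum of p and q, which are then the roots of a quadratic.

```python
from math import sqrt, isclose


def dominant_fraction(p, q):
    """Proportion p - pq + q assigned to the dominant phenotype in Theorem 17."""
    return p - p * q + q


def solve_p_q(both_alpha, mixed):
    """Recover the two values from pq and p(1-q) + q(1-p) (no-dominance case).

    The pair is returned in increasing order; the two input numbers alone do
    not say which root belongs to the male and which to the female gametes.
    """
    total = mixed + 2 * both_alpha          # p + q
    disc = total * total - 4 * both_alpha   # (p - q) squared
    root = sqrt(max(disc, 0.0))
    return (total - root) / 2, (total + root) / 2
```

For example, dominant_fraction gives three quarters both for p = q = 1/2 and for p = 0, q = 3/4, so a three-to-one sample cannot separate these cases, whereas solve_p_q(0.125, 0.5) recovers the values 1/4 and 1/2.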
Finally a theorem will be given which might have been known to
Mendel. It is an example of a system which includes only one genetical
unit. Suppose F and M are the females and males respectively of some
species, suppose further that g and h are two mutually exclusive classes
of gametes and H a class of environments all satisfying the following
conditions: (i) $\{F, g, H\}$ is a genetical unit; (ii) $D(g, h, H) \neq \Lambda . D(g, h, H) \subset M . G(g, h, H) \neq \Lambda . G(g, h, H) \subset g \cup h$; (iii) $D(h, h, H) = \Lambda$
(therefore $\{M, h, H\}$ is not a genetical unit); (iv) $S = \{F, M, g, h, H\}$ and
$S \in \mathrm{rand}\ U \cap \mathrm{rand}\ D(H)$. If these conditions are satisfied we shall
have:
$F_H(g, g; g, h) \in \tfrac12 F \cap \tfrac12 M$ if and only if $G(g, h, H) \in \tfrac12 g \cap \tfrac12 h$
THEOREM 19 $F \cap M = g \cap h = \Lambda . \{F, g, H\} \in \mathrm{genunit} . D(g, h, H) \neq \Lambda . D(g, h, H) \subset M . G(g, h, H) \neq \Lambda . G(g, h, H) \subset g \cup h . S = \{F, M, g, h, H\} . S \in \mathrm{rand}\ U \cap \mathrm{rand}\ D(H) .\supset: F_H(g, g; g, h) \in \tfrac12 F \cap \tfrac12 M .\equiv. G(g, h, H) \in \tfrac12 g \cap \tfrac12 h$
This theorem requires for its derivation T 11, T XV, T VII, D 10, D 11,
T XII, T 13.
We can now see clearly what was Mendel's discovery in the Christopher
Columbus sense and what was his discovery in the J. J. Thomson sense
distinguished above. His discovery in the first sense (inductive hypothesis)
was the $\tfrac34 P \cap \tfrac14 Q$ frequencies in the offspring when hybrids are mated,
if this is understood as being asserted (as above) for inaccessible sets.
This is expressed in THEOREM 15. Mendel's discovery in the second sense
(explanatory hypothesis) is the hypothesis that is expressed in H 4a. But
we have seen that in this form it is unnecessary. The much weaker form
of H 4c suffices, especially if we begin with T 16 and then, using its
results with the value of p determined by sampling (coupled with the
additional hypothesis: p = q), we pass to T 17. Where there is no do-
minance (in Mendel's experiments one phenotype is in each case dominant
to the other) p and q can be determined independently of T 16 with the
help of T 18. Thus the convenient minimum assumption is H 4b. It could
be argued that the assumption of two kinds among the gametes of
hybrids is not so much a discovery of the second kind as a special appli-
cation of a general causal principle to embryology and genetics. But this
does not mean that it cannot be discussed.
It is often said that Mendel discovered what is called particulate
inheritance. But, except in the sense in which gametes are particles,
Mendel did not specifically speak of particles. Strictly speaking a hypothe-
sis involving cell parts only becomes important when we consider the
breakdown of Mendel's Second Law. The whole of Mendel's work can be
expressed with the help of $D(\alpha, \beta, E)$, $G(\alpha, \beta, E)$ and $F_E(\alpha, \beta; \gamma, \delta)$ and
thus in terms of gamete and environment classes, the classes of zygotes
which can be formed with them and the classes of lives which develop
from the zygotes in the environments.
The above analysis has shown the central role which is played by the
hypotheses of random union of the gametes and of random development
in obtaining the Mendelian ratios (see especially steps (2), (3) and (4) in
the proof of Theorem 15). These do not receive the attention they deserve
in genetical books. Sometimes they are not even mentioned. This is
particularly true of the hypothesis of random development. That Mendel
was aware of it is clear from the following passage in the translation from
which we have already quoted (p. 340) :
A perfect agreement in the numerical relations was, however, not to be ex-
pected, since in each fertilization, even in normal cases, some egg cells remain
undeveloped and subsequently die, and many even of the well-formed seeds
fail to germinate when sown.
In addition to the special hypotheses H 1 to H 5 there are also the six
postulates to be taken into consideration. Any departure from these
could affect the result. This provides plenty of scope for reflexion. But
perhaps the most striking feature of the Mendelian systems is the fact
that only one class of environments is involved and is usually not even
mentioned. Some interesting discoveries may await the investigation of
multi-environmental systems. Provision for this is made in Definitions
4, 7 and 11 and a variable having classes of environments as its values
accompanies all the above biological functors. At the same time attention
should be drawn to the fact that no provision is made, either here or in
current practice, for mentioning the environments of the gametes. And
yet it is not difficult to imagine situations in which the necessity for this
might arise.
It will be noticed that no use has here been made of the words 'proba-
bility', 'chance', or 'independent', although these words are frequently
used in genetical books with very inadequate explanation. Here the term
'random' has been used but its two uses have been explained in detail.
In passing it may be mentioned that 'S is random with respect to $F_E$' is
also definable along analogous lines and then the Pearson-Hardy law
is derivable.
In conclusion I should like to draw attention to the way in which the
foregoing analysis throws into relief the genius of Mendel, which enabled
him to see his way so clearly through such a complicated situation. I also
wish to express my thanks to Professor John Gregg of Duke University
and to my son Mr Michael Woodger of the National Physical Laboratory
for their help in the preparation of this article.
Symposium on the Axiomatic Method
AXIOMATIZING A SCIENTIFIC SYSTEM BY AXIOMS IN
THE FORM OF IDENTIFICATIONS
R. B. BRAITHWAITE
University of Cambridge, Cambridge, England
A scientific deductive system ("scientific theory") is a set of propositions
in which each proposition is either one of a set of initial propositions
(a "highest-level hypothesis") or a deduced proposition (a "lower-level
hypothesis") which is deduced from the set of initial propositions ac-
cording to logico-mathematical principles of deduction, and in which some
(or all) of the propositions of the system are propositions exclusively
about observable concepts (properties or relations) and are directly
testable against experience. In this paper these testable propositions will
be taken to be empirical generalizations of the form Every A-specimen is
a B-specimen, whose empirical testability consists in the fact that such a
proposition is to be rejected if an A-specimen which is not a B-specimen
is observed. (Statistical generalizations of the form The probability of an
A-specimen being a B-specimen is p can be brought within the treatment;
here testability depends upon more sophisticated rejection rules in terms
of the proportions of B-specimens in observed samples of A-specimens.)
The object of constructing a scientific theory is to 'explain' empirical
generalizations by deducing them from higher-level hypotheses.
A scientific deductive system will make use of a basic logic independent
of the system to provide its principles of deduction. It will be convenient
to assume that this basic logic includes all the deductive principles of the
system, so that none of these are specific to the system itself and the
deductive power of the system will be given by the addition to the basic
logic of the system's set of initial propositions. The system can then be
expressed by a formal axiomatic system (called here a calculus) in which
the axioms (the "initial formulae") fall into two sets, one set consisting
of those axioms required for the basic logic of the system (which set will
be empty if the basic logic has no axioms), no axiom of which contains
any extra-logical constants, and another set of axioms containing
non-vacuously extra-logical constants (Tarski's proper axioms
[10, p. 306]) corresponding, one to one, to the set of initial propositions
of the scientific system. The rules of derivation of the calculus will then
correspond to the deductive principles of the basic logic. Since we are not
concerned with the nature of this basic logic we shall ignore the axioms
and theorems of the calculus which forms a sub-calculus representing the
basic logic and shall only be interested in proper axioms and proper
theorems (i.e. those which contain non-vacuously extra-logical constants,
which will be called primitive terms) and which are interpreted as re-
presenting the propositions of the scientific system. The theorems (or
axioms) representing the directly testable propositions will be called
testable theorems (or axioms), and the primitive terms occurring in these
theorems observable terms.
The problem raised by scientific deductive systems for the philosophy
(or logic or semantics) of science is to understand how the calculus is inter-
preted as expressing the system. If all the proper axioms are testable
axioms, and consequently all the proper theorems are testable theorems,
there is no difficulty, since all the extra-logical terms (i.e. primitive terms)
occurring in the calculus are observable terms so that all the proper
axioms and theorems can be interpreted as propositions directly testable
by experience. The semantic rules for the interpretation of the calculus by
means of direct testability apply equally to all the proper axioms and
theorems ; so the calculus can be interpreted all in a piece.
But the situation is different for the deductive system of a more ad-
vanced science which makes use in its initial propositions of concepts (call
them theoretical concepts) which are not directly observable, so that the
propositions containing these are not directly testable. Here the axioms
of the calculus contain primitive terms which are not observable terms,
and these theoretical terms have to be given an interpretation not by a
semantic rule concerning direct testability but by the fact that testable
theorems are derivable from them in the calculus. The calculus is thus
interpreted from the bottom upwards : the testable theorems are interpreted
by a semantic rule of direct testability, and the other theorems and axioms
are then interpreted syntactico-semantically by their syntactic relations
to the testable theorems. Theoretical terms are not definable by means
of observable concepts — the 'reductionist' programme of thorough-
going logical constructionists and operationalists cannot profitably be
applied to the theoretical concepts of a science — though they may be
said to be implicitly defined by virtue of their place in a calculus which
contains testable theorems. The empirical interpretation of the calculus
is thus given by a directly empirical interpretation of the testable axioms
and theorems and an indirectly empirical interpretation of the remainder.
(For all this see R. B. Braithwaite [3, Chapter III].)
In order that a calculus containing theoretical terms should be able to
be interpreted in this indirectly empirical way, it is necessary that each
of the observable terms should occur in at least one of the proper axioms.
These may be divided into three categories: (1) Testable axioms whose
primitive terms are all observable terms; (2) Axioms whose primitive
terms are all theoretical terms: these will be called Campbellian axioms,
since collectively they represent N. R. Campbell's "hypothesis" consisting
of "statements about some collection of ideas which are characteristic
of the [scientific] theory" ([4], p. 122), and the highest-level hypotheses
represented by them will be called Campbellian hypotheses;
(3) Axioms whose primitive terms are both observable terms and theoretical
terms: these will be called dictionary axioms, since they correspond
to Campbell's "dictionary". Since no philosophical problems arise in
connexion with testable axioms, we will suppose that there are no testable
axioms in the calculus, so that no direct empirical interpretation is
possible at the axiom level. To simplify our discussion we will further
suppose that each dictionary axiom is of the form of an identity
$a = (\ldots\lambda_1 \ldots \lambda_2 \ldots)$
where a is an observable term standing alone on the left-hand side of the
identity with the right-hand side containing only theoretical terms $\lambda_1$, $\lambda_2$,
etc. as primitive terms. Dictionary axioms in this form will be called
identificatory axioms, since they may be said to 'identify' an observable
term by means of theoretical terms. In order that these identificatory
axioms should be able to function in a calculus to be interpreted as a
scientific system, the basic logic governing the identity sign will be
assumed to be strong enough to permit the derivation from an axiom of
the form $a = (\ldots\lambda_1 \ldots \lambda_2 \ldots)$ of every theorem obtained by substituting
a for $(\ldots\lambda_1 \ldots \lambda_2 \ldots)$ at any place in any axiom or theorem in
which $(\ldots\lambda_1 \ldots \lambda_2 \ldots)$ occurs.
The simplified calculi to be considered will thus contain, as proper
axioms, Campbellian axioms concerned with the theoretical terms of the
scientific calculus and identificatory axioms relating the observable terms
of the calculus to the theoretical terms by identifying each of the former
with a logical function of the latter. If a is an n-ary predicate, an identificatory
axiom $a = (\ldots\lambda_1 \ldots \lambda_2 \ldots)$ will, with a suitable basic logic,
permit the derivation from this axiom of
$(x_1)(x_2) \ldots (x_n)(a(x_1, x_2, \ldots, x_n) \equiv Q(x_1, x_2, \ldots, x_n))$,
where Q is an abbreviation for $(\ldots\lambda_1 \ldots \lambda_2 \ldots)$, so that a will be definable
with respect to the identificatory axiom (together with the basic
logic) in terms of $\lambda_1$, $\lambda_2$, etc. in E. W. Beth's sense of "definable" ([2], p.
335). (In [3, p. 57] I called sentences of the form $a = (\ldots\lambda_1 \ldots \lambda_2 \ldots)$
definitory formulae; but I now prefer to call them identificatory axioms
(or theorems) and to reserve the word "definition" and its cognates for
notions which are semantical and not purely syntactical.)
Most axiomatizations of a scientific theory contain Campbellian axioms
among their proper axioms. Philosophers of science frequently think that
it is the Campbellian axioms representing the Campbellian hypotheses
which express the essence of the theory, the dictionary axioms (which in
the simplest cases are identificatory axioms) having the function of
'semantical rules' or 'co-ordinating definitions' or 'definitory stipulations'
relating the observable terms to the theoretical terms. Thus there would
be an absolute distinction between Campbellian and dictionary axioms.
It would follow from this point of view that a calculus which makes use
of theoretical terms must include Campbellian axioms if it is to be inter-
preted to express what is of importance in the scientific theory. This,
however, is not the case. Calculi whose proper axioms are all identificatory
can serve to express empirical deductive systems : indeed, given a calculus
which contains Campbellian axioms, it is sometimes possible to construct
another calculus having the same theoretical terms whose proper axioms
are all identificatory which is testably equivalent to the first calculus in the
sense that the testable theorems of the two calculi are exactly the same.
This will always be the case if the basic logic of the calculus is simple
enough. We will consider the case in which the basic logic is merely that of
propositional logic combined with that of the first-order monadic predicate
calculus with identity (and with a finite number of predicates). This
basic logic is also that of finite Boolean lattices, and it will be convenient
to regard it as expressed by a calculus (called a Boolean calculus) whose
logical constants are, besides those of the propositional calculus, constants
whose class interpretation is union ($\cup$), intersection ($\cap$), complementation
('), the universe class (e), the null class (o), class inclusion ($\subset$) and class
identity (=). This basic logic is sufficient for the construction of theories
in which empirical generalizations of the form Every AB...-specimen is a
K-specimen (represented in the calculus by a testable theorem
$(a \cap b \cap \ldots) \subset k$, a being interpreted as designating the class of A-specimens,
and similarly for the other small italic letters) are explained
as deducible from initial propositions containing theoretical class-concepts
designated by $\lambda_1, \lambda_2, \ldots$ (Simple examples of such theories are given in
[3, Chapters III and IV].) Since all the propositions concerned will be
universal propositions (i.e., of the form Every ...-specimen is a ...-specimen),
every formula of the calculus is equivalent to a formula in
normal form ... = o.
Let $\mathfrak{S}_1$ be a calculus of this type comprising $n$ identificatory axioms
$D_1, D_2, \ldots D_n$ identifying the $n$ observable terms $a_1, a_2, \ldots a_n$ by means
of $l$ theoretical terms $\lambda_1, \lambda_2, \ldots \lambda_l$.
$D_r$ is $a_r = \Lambda_r$, where $\Lambda_r$ is a Boolean expression whose terms are all
theoretical terms. Let the calculus also comprise $m$ Campbellian axioms
$C_1, C_2, \ldots C_m$ containing theoretical terms alone. Derive from $C_r$ the
equivalent formula $\Gamma_r = e$, where $\Gamma_r$ is a Boolean expression whose terms
are all theoretical terms, and let $\Gamma$ be $(\Gamma_1 \cap \Gamma_2 \cap \ldots \Gamma_m)$.
Now consider a related calculus $\mathfrak{S}_2$ containing the same observable and
theoretical terms but with no Campbellian axioms. Let its $n$ identificatory
axioms be $E_1, E_2, \ldots E_n$, where $E_r$ is $a_r = (\Lambda_r \cap \Gamma)$. We will prove that
(under a weak condition) $\mathfrak{S}_2$ is testably equivalent to $\mathfrak{S}_1$ in that the
testable theorems in each calculus, obtained in each case by eliminating
the theoretical terms from the axioms, are the same.
The proof depends upon the classical theory of elimination of variables
from Boolean equations and is a development of a result of A. N. Whitehead
[11, p. 60, (5) and p. 65, (1)]. Consider the 'universe' of the $l$ theoretical
terms $\lambda_1, \lambda_2, \ldots \lambda_l$ (these are common to both $\mathfrak{S}_1$ and $\mathfrak{S}_2$). The
$2^l$ minimals $(\lambda_1 \cap \lambda_2 \cap \ldots \lambda_l)$, $(\lambda_1 \cap \lambda_2 \cap \ldots \lambda_l')$, $\ldots$ $(\lambda_1' \cap \lambda_2' \cap \ldots \lambda_l')$
form a partition of the universe (in accordance with the basic logic of
finite Boolean lattices), i.e. using a suffixed $\mu$ to designate a minimal,
$\mu_r \cap \mu_s = o$ for $r \neq s$; $\bigcup_i \mu_i = e$. Then $\Lambda_r$, the Boolean expression whose
terms are all theoretical terms which is identified with $a_r$ by the identificatory
axiom $D_r$ of $\mathfrak{S}_1$, is the union of the minimals in some sub-set of
the minimals; and $D_r$ is equivalent to $a_r = \bigcup_{i:\mu_i \subset \Lambda_r} \mu_i$ and, in normal form, to
$$(a_r' \cap \bigcup_{i:\mu_i \subset \Lambda_r} \mu_i) \cup (a_r \cap \bigcup_{i:\mu_i \subset \Lambda_r'} \mu_i) = o.$$
If $D$ is $D_1 . D_2 \ldots D_n$, $D$ is then equivalent to $\bigcup_i (A_i \cap \mu_i) = o$, where
$A_i$ is $(\bigcup_{j:\mu_i \subset \Lambda_j} a_j' \cup \bigcup_{j:\mu_i \subset \Lambda_j'} a_j)$.
If $C$ is $C_1 . C_2 \ldots C_m$ (the conjunction of the Campbellian axioms), $C$ is
equivalent to $\bigcup_{i:\mu_i \subset \Gamma'} \mu_i = o$. $C . D$ is then equivalent to
$$\bigcup_i (A_i \cap \mu_i) \cup \bigcup_{i:\mu_i \subset \Gamma'} (e \cap \mu_i) = o.$$
The resultant in normal form $R_1$ of eliminating all the minimals from
$C . D$ is $\bigcap_{i:\mu_i \subset \Gamma} A_i = o$. $R_1$ is equivalent to the conjunction of all the testable
theorems; so a testable formula $T$ is a theorem of $\mathfrak{S}_1$ if and only if
$R_1 \supset T$.
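By a similar argument applied to the axioms of $\mathfrak{S}_2$, $E_r$ is equivalent to
$a_r = \bigcup_{i:\mu_i \subset (\Lambda_r \cap \Gamma)} \mu_i$ and, in normal form, to
$$(a_r' \cap \bigcup_{i:\mu_i \subset (\Lambda_r \cap \Gamma)} \mu_i) \cup (a_r \cap \bigcup_{i:\mu_i \subset (\Lambda_r' \cap \Gamma)} \mu_i) \cup (a_r \cap \bigcup_{i:\mu_i \subset \Gamma'} \mu_i) = o.$$
If $E$ is $E_1 . E_2 \ldots E_n$, $E$ is then equivalent to $\bigcup_i (B_i \cap \mu_i) = o$, where,
for an $i$ such that $\mu_i \subset \Gamma$, $B_i$ is $(\bigcup_{j:\mu_i \subset \Lambda_j} a_j' \cup \bigcup_{j:\mu_i \subset \Lambda_j'} a_j)$;
for an $i$ such that $\mu_i \subset \Gamma'$, $B_i$ is $\bigcup_j a_j$.
The resultant in normal form $R_2$ of eliminating all the minimals from $E$ is
$\bigcap_i B_i = o$, which, since $B_i = A_i$ for every $i$ such that $\mu_i \subset \Gamma$, is equivalent
to $\bigcap_{i:\mu_i \subset \Gamma} A_i \cap \bigcup_j a_j = o$. A testable formula $T$ is a theorem of $\mathfrak{S}_2$
if and only if $R_2 \supset T$.
Now impose the weak condition that $\Gamma$ should not be wholly included
within $\bigcup_j \Lambda_j$, i.e. $\Gamma \neq (\Gamma \cap \bigcup_j \Lambda_j)$. Under this condition there is at
least one minimal, say $\mu_s$, which is such that both $\mu_s \subset \Gamma$ and $\mu_s \subset \Lambda_j'$
for every $j$. Then for this $s$, $A_s = B_s = \bigcup_j a_j$, and $R_2$, like $R_1$, is
$\bigcap_{i:\mu_i \subset \Gamma} A_i = o$. Hence $R_1 = R_2$; and $T$ is a testable theorem of $\mathfrak{S}_1$ if and only
if $T$ is a testable theorem of $\mathfrak{S}_2$.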
Thus, unless the Campbellian axioms $C$ of $\mathfrak{S}_1$ restrict the universe of
theoretical terms to a class $\Gamma$ which is included in the union of all the
observable terms according to their identifications in $\mathfrak{S}_1$, the calculus $\mathfrak{S}_2$,
constructed from $\mathfrak{S}_1$ by omitting its Campbellian axioms and substituting
$(\Lambda_r \cap \Gamma)$ for $\Lambda_r$ in each of its identificatory axioms, is testably equivalent
to $\mathfrak{S}_1$ in the sense that every testable theorem of the one is also a testable
theorem of the other.
In Whitehead's language [11, p. 59] each identificatory axiom is unlimiting
with respect to all the theoretical terms simultaneously in the
sense that the resultant of eliminating the observable term from the
axiom is equivalent to $o = o$, a theorem of the basic logic. A calculus
such as $\mathfrak{S}_2$ whose proper axioms are all identificatory therefore imposes
no limitation upon the universe (Whitehead's field) of the theoretical
terms. Such a limitation is imposed by the Campbellian axioms of $\mathfrak{S}_1$.
But, if this limitation restricts the theoretical-term universe to a universe
which falls wholly within the class which is the union of all the observable
terms, it will be impossible in the future to adapt the scientific theory
expressed by a calculus using theoretical terms limited in this way to
explain new empirical generalizations relating some of the observable
concepts to new observable concepts not concerned in the original theory
[3, pp. 73ff.]. An axiomatization of a scientific theory which is capable of
being adapted in this way must not impose such a drastic limitation upon
its theoretical terms. So our result may be put in the form that to every
adaptable calculus comprising Campbellian axioms a testably equivalent
calculus can be constructed all of whose proper axioms are identificatory.
This result has been established only for a scientific system which
makes use of a very simple basic logic (that of finite Boolean lattices) ; and
the extent to which it can be generalized to apply to systems comprising
Campbellian hypotheses and using more powerful basic logics requires
investigation. That it is possible to have a theory using a mathematical
basic logic in whose calculus theorems are derived from identificatory
axioms alone is shown by such a simple example as that of explaining
$a^2 + b^2 = 1$, where $a$ and $b$ stand for observably determined numbers, by
identifying $a$ with $\sin\theta$ and $b$ with $\cos\theta$, $\theta$ being a theoretical 'parameter'.
One obvious qualification must be made. If identificatory axioms in the
form of a description $a = (\iota x)(\phi(x))$ are permitted, and if their underlying
logic is similar to Russell's doctrine of descriptions in that $(\exists x)(\phi(x))$ is
derivable from any formula containing $(\iota x)(\phi(x))$, an identificatory axiom
for a calculus $\mathfrak{S}_3$ of the form $a_r = (\iota x)(x = \Lambda_r . \Gamma = e)$ would imply both
$a_r = \Lambda_r$ and $\Gamma = e$, and all the axioms of $\mathfrak{S}_1$ would be derivable from
those of $\mathfrak{S}_3$, a stronger system. But every theoretical scientist would
regard the proposal to substitute a theory expressed by $\mathfrak{S}_3$ for one expressed
by $\mathfrak{S}_1$ as a logician's trick. So for scientific discussion the notion
of identificatory axiom must be restricted to one from which alone no
Campbellian theorem can be derived, i.e. an identificatory axiom must be
unlimiting with respect to all its theoretical terms simultaneously.
The possibility, in suitable cases, of constructing a testably equivalent
calculus comprising only identificatory proper axioms is very relevant
to the discussion among philosophers of science as to whether or not some
of the highest-level hypotheses of the scientific theory expressed by the
calculus should be regarded as analytic or logically necessary rather than
as factual or contingent. It is admitted by all empiricists that the con-
junction of all the hypotheses must be contingent, since together they
have empirically testable consequences. But, if the highest-level hypothe-
ses contain theoretical concepts, it is never from one of these hypotheses
alone but always from a conjunction of them that testable propositions
are deducible; and so the possibility is left open that some of these
hypotheses are not contingent, and hypotheses representing dictionary
axioms (e.g. the identificatory axioms considered in this paper) are
frequently held to be analytic. For example, A. J. Ayer [1, p. 13], in his
account of the "indirect verifiability" of scientific statements (which is
similar to mine), explicitly allows that the conjunctions whose con-
sequences are "directly verifiable" may include analytic statements, his
reason being that "while the statements that contain [theoretical] terms
may not appear to describe anything that anyone could ever observe, a
'dictionary' may be provided by means of which they can be transformed
into statements that are verifiable ; and the statements which constitute
the dictionary can be regarded as analytic". And E. Nagel [9, pp. 209f.],
in a recent discussion of my book [3], criticises me for my "disinclination
to regard as 'absolute' Norman Campbell's distinction between the
'hypotheses' and the 'dictionary' of a theory. In Campbell's analysis, the
hypotheses postulate just what relations hold between the purely theo-
retical but otherwise unspecified terms of a theory, while the dictionary
provides the co-ordinating definitions for some of the theoretical terms or
for certain functions of them". "Every testable theory must include a
sufficient number of co-ordinating definitions which are not subject to
experimental control" ; and, though Nagel never explicitly says that co-
ordinating definitions state analytic propositions, he declares that they
have "the status of semantic rules" and contrasts them with "factually
testable assumptions" and with "genuine hypotheses". The existence of
calculi with no Campbellian axioms representing "genuine hypotheses"
and the possibility in suitable cases of converting calculi having Camp-
bellian axioms into calculi with only identificatory proper axioms make
it impossible to ascribe a logically necessary status to what is represented
by the identificatory axioms taken all together. Since it is the whole set
of the hypotheses that conjunctively are "subject to experimental
control", it is possible that some sub-set of them are not so subject. But
there would seem to be no good reason for placing any of the identificatory
axioms in this latter category. Nagel goes so far as to say that in the
simplest calculus which I gave as an example [3, pp. 54ff.], in which
$a = (\lambda \cap \mu)$, $b = (\mu \cap \nu)$, $c = (\nu \cap \lambda)$ are the axioms and $(a \cap b) \subset c$,
$(b \cap c) \subset a$, $(c \cap a) \subset b$ are the testable theorems, "the obvious (and I
think correct) alternative to Braithwaite's account is to construe two of
the equational formulas in the [axiom set] not as hypotheses but as
having the function of semantical rules . . . which assign partial meanings
to the theoretical terms and to count the remaining formula as a genuine
hypothesis when such definitory stipulations have once been laid down."
But he gives no way of selecting the one "genuine hypothesis" from among
the three which appear in the completest symmetry.
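The symmetry of the three axioms can be exhibited mechanically. The sketch below is an illustration under my own assumptions, not part of the calculus in [3]: it treats each specimen as a truth-assignment to $\lambda$, $\mu$, $\nu$, checks the three testable theorems over all eight minimals, and then checks that if any one of the axioms is dropped (so that its observable term is left free) at least one testable theorem fails. The helper names are hypothetical.

```python
from itertools import product

# The three identificatory axioms: a = λ ∩ μ, b = μ ∩ ν, c = ν ∩ λ.
AXIOMS = {
    "a": lambda lam, mu, nu: lam and mu,
    "b": lambda lam, mu, nu: mu and nu,
    "c": lambda lam, mu, nu: nu and lam,
}

def theorems(a, b, c):
    """The testable theorems (a∩b)⊂c, (b∩c)⊂a, (c∩a)⊂b for one specimen."""
    return [(not (a and b)) or c, (not (b and c)) or a, (not (c and a)) or b]

def all_theorems_hold(dropped=None):
    """Check every minimal; a dropped axiom leaves its observable term unconstrained."""
    for lam, mu, nu in product([False, True], repeat=3):
        values = {name: rule(lam, mu, nu) for name, rule in AXIOMS.items()}
        free_choices = [False, True] if dropped else [None]
        for free in free_choices:
            if dropped:
                values[dropped] = free
            if not all(theorems(values["a"], values["b"], values["c"])):
                return False
    return True

print(all_theorems_hold())                                       # all three axioms
print([all_theorems_hold(dropped=name) for name in ("a", "b", "c")])
```

Each axiom is needed in the same way as the other two, which is the sense in which no one of them can be singled out as the "genuine hypothesis".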
There would seem to be a stronger case for regarding Campbellian
hypotheses as logically necessary and for accounting for the contingency
of the lowest-level generalizations by the contingency of the identifications
provided by identificatory axioms. The function of Campbellian axioms
is always that of limiting the universe of the theoretical terms, left un-
limited by the identificatory axioms; and it can be said that, since the
theoretical scientist in constructing a theory to explain his empirical
generalizations has great liberty of choice in selecting his theoretical
terms, he may well in the act of selecting them impose a limitation upon
the 'degrees of freedom' of their universe by a set of Campbellian axioms,
and this limitation (i.e. the conjunction of the Campbellian axioms) will
never by itself be "subject to experimental control", since it is concerned
only with theoretical terms. But, in a calculus comprising both identi-
ficatory and Campbellian axioms, the testable theorems derivable from
the former axioms form only a sub-class of the testable theorems derivable
from the conjunction of all the axioms; so the Campbellian axioms may
be given an empirical interpretation by virtue of the testability of the
additional theorems which are derivable by adding them to the identi-
ficatory axioms. There is no adequate reason for refusing to interpret
every proper axiom in a calculus expressing a scientific theory as repre-
senting a contingent proposition, the empirical interpretation of the
axioms being given by the syntactical relations of the whole set of axioms
to the testable theorems derivable from them. The only exception would
be the uninteresting case in which a redundant theoretical term is intro-
duced into a calculus by an axiom identifying it with a logico-mathematical
function of other theoretical terms. Such a sterile axiom ([3], p. 113),
functioning merely as an abbreviatory device, may rightly be regarded as
analytic.
There is one other consideration which has tended to confuse the issue
in the minds of some scientists and philosophers of science. If, as is
usually the case, the basic logic of the scientific theory is expressible by a
calculus with axioms and theorems interpreted as propositions of logic
or mathematics, these theorems will contain no extra-logical constants;
and the use of one of these theorems in the derivation of a proper theorem
of the scientific calculus will require an intermediate step in which a
logical theorem is applied to the primitive terms concerned. If the logical
sub-calculus uses the device of variables, this application will be effected
by making substitutions of primitive terms for some or all of these
variables. The theorem so derived will not be a proper theorem of the
calculus, since the primitive terms will occur in it only vacuously, but
neither will it be a theorem of the logical sub-calculus since it will con-
tain primitive terms as extra-logical constants. Call such a theorem an
applicational theorem. (For example, the derivation of $(a \cap b) \subset c$ from
$a = (\lambda \cap \mu)$, $b = (\mu \cap \nu)$, $c = (\nu \cap \lambda)$ will require (if the basic logic is expressed
as a Boolean calculus) the use of the applicational theorem
$(\mu \cap \mu) = \mu$, which is not itself a theorem of a Boolean calculus but is
derived from the Boolean theorem (or axiom) $(\chi \cap \chi) = \chi$, where $\chi$ is a
free variable with class symbols as substitution values.)
Applicational theorems fall in the no-man's-land between the theorems
of the basic-logic sub-calculus and the proper theorems of the calculus.
If the scientific part of the whole calculus is regarded not (as we have
thought of it) as consisting of proper axioms and theorems (i.e. those
containing primitive terms non-vacuously), but as consisting of all the
axioms and theorems which are not comprised in the basic-logic sub-calculus
(i.e. those which contain primitive terms either vacuously or
non-vacuously), then the applicational theorems will be classed as falling
within the scientific part. Since they will usually function there as
premisses from which, together with the proper axioms, proper theorems
are derived, and will not themselves be derived within this scientific part,
it will be natural to class them, within this scientific part, with the proper
axioms rather than with the proper theorems. A person who takes this
point of view will then hold that the scientific part of the calculus com-
prises axioms which are to be interpreted as representing logically
necessary propositions, these 'pseudo-axioms' being applicational theorems
whose interpretations are logically necessary by virtue of being
applications to the concepts concerned of the laws of logic or mathematics.
When, as is usually the case, the primitive terms concerned in the appli-
cational pseudo-axioms are theoretical terms, these pseudo-axioms will
simulate Campbellian axioms; and if the calculus is one all of whose proper
axioms are identificatory, it will be described by a person who mistakes
such applicational pseudo-axioms for Campbellian axioms not as a calcu-
lus with no Campbellian axioms, but as a calculus whose Campbellian
axioms represent Campbellian hypotheses which are logically necessary.
If this person also does not regard identificatory axioms as representing
"genuine hypotheses", he may well assert that all the "genuine hypotheses"
of the theory expressed by the calculus are logically necessary or a
priori.
In our own time the thesis that the fundamental laws of physics are a
priori has been maintained by A. S. Eddington, who has attempted to
infer them, including the pure numbers which occur in them, from
"epistemological considerations" ([7], p. 57). The reasons Eddington
gave at different places in his writings for his general thesis are different
and doubtfully consistent, but his principal reason would seem to be an
argument on Kantian lines that "the fundamental laws and constants
of physics . . . are a consequence of the conceptual frame of thought into
which our observational knowledge is forced by our method of formulating
it, and can be discovered a priori by scrutinising the frame of thought"
([7], p. 104). Such a view is incompatible with an empiricist philosophy of
science. But Eddington's programme of constructing a unified theory for
physics whose fundamental hypotheses are to be a priori appears in a new
light if his goal is described negatively as a theory having no contingent
Campbellian hypotheses. For his goal would then, on our way of thinking,
be a theory with no Campbellian hypotheses at all, represented by a
calculus whose proper axioms were all identificatory; and we should
explain his attribution of apriority to such a theory by his having mistaken
for Campbellian axioms the applicational pseudo-axioms required to
apply the basic logic to the concepts of the theory. And a programme
of constructing a Campbellian-hypothesis-free system of physics, un-
hopeful though it may appear to a physicist, is not ridiculous to an em-
piricist philosopher.
Perhaps because Eddington was not interested in axiomatics this way
of looking at his programme never, it seems, occurred to him. But
scattered throughout his writings (e.g. [6], pp. 3, 242; [7], pp. 41, 134;
[8], p. 265) are many references to the essential part to be played by
"identification" and "definition" in relating observation to theory, and
he does not suppose that these identifications are a priori: "we cannot
foresee what will be the correspondence between elements in [the] a priori
physical description and elements in our familiar apprehension of the
universe" [7, p. 134]. That Eddington's ideal was a system with no Campbellian
axioms is suggested by his preferring the theory of numbers to
geometry as an analogue for a system of physics: "If the analogy with
geometry were to hold good, there would be a limit to the elimination of
hypothesis, for a geometry without any axioms at all is unthinkable.
But . . . [in] the theory of numbers . . . there is nothing that can be called
an axiom. We shall find reason to believe that this is in closer analogy
with the system of fundamental laws of physics" [7, p. 45]. So I think it is
a fair, and charitable, gloss on Eddington to take his programme as the
constructing for the whole of physical theory of an identificatory system,
whose axiomatization would comprise only identificatory proper axioms,
in contrast with the programme of all other theoretical physicists of con-
structing Campbellian systems, whose axiomatization would comprise
Campbellian as well as identificatory axioms.
Can anything in general be said as to the relative advantages of con-
structing Campbellian or identificatory systems as explanatory scientific
theories? Not much more, I think, than that, since the calculus expressing
a Campbellian system will be stronger (by virtue of comprising Camp-
bellian axioms and theorems) than a testably equivalent identificatory
calculus (in which no Campbellian theorem can be derived), a Campbellian
system can probably be more easily adapted in the future to explain new
empirical generalizations, as is illustrated in the history of physics by the
great adaptability of systems which included the conservation of energy as
a Campbellian hypothesis. An identificatory system would seem to be the
more appropriate one for providing the most economical theory to ex-
plain a closed set of empirical generalizations. But it may well be the case
that there are subjects, perhaps those of some of the social sciences, in
which identificatory systems are those which arise most naturally in
reflecting upon the subject-matter concerned. The development of the
social sciences has been retarded by a false belief that numerical mathe-
matics provides the only deductive techniques so that, to construct a
scientific theory, it is necessary that both the observable and the theo-
retical concepts of a science should be numerically measurable. It may
also have been retarded by a false belief that a science can only use
theoretical concepts if these can be related together in Campbellian
hypotheses. A realization by social scientists that there is no need to
imitate the methods of theory-construction which have proved so success-
ful in the physical sciences, and that theories whose theoretical concepts
occur only in hypotheses 'identifying' the observable concepts are perfect-
ly good explanatory theories (provided, of course, that testable con-
sequences can be deduced from these hypotheses) , might encourage them
to a greater boldness in thinking up theoretical concepts and trying out
theories containing them. This sort of encouragement is the contribution
a philosopher of science can make to the progress of science.
One last and philosophical remark. To identify, by means of an identi-
ficatory axiom, an observable term with a logico-mathematical function
of theoretical terms in a calculus expressing a scientific theory is one way
of explicating (in the sense of R. Carnap [5, Chapter I]) the "inexact
concept" for which the observable term stands in ordinary language. To
propose a scientific theory containing theoretical concepts which is to be
testable against experience involving inexact concepts requires expli-
cations of these concepts; and, if the theory is an identificatory system,
the hypotheses of the theory will consist entirely of such explications.
Conversely, a set of explications by means of theoretical concepts will
constitute the hypotheses of an identificatory system ; and, if this system
permits the deduction of empirically testable consequences, it will be a
scientific theory. A philosopher propounding such a system of explications
must not be dismissed as a rationalist metaphysician on the sole ground
that the hypotheses of his system appear all in the form of new 'definitions'.
His system will only fail to be scientific if nothing empirical
follows from all his definitions taken together.
Bibliography
[1] AYER, A. J., Language, Truth and Logic (2nd edition). London 1946, 160 pp.
[2] BETH, E. W., On Padoa's method in the theory of definition. Indagationes
Mathematicae, vol. 15 (1953), pp. 330-339.
[3] BRAITHWAITE, R. B., Scientific Explanation. Cambridge 1953, X + 376 pp.
[4] CAMPBELL, N. R., Physics: The Elements. Cambridge 1920, X + 565 pp.
[5] CARNAP, R., Logical Foundations of Probability. Chicago 1950, XVIII + 607 pp.
[6] EDDINGTON, (Sir) A. S., Relativity Theory of Protons and Electrons. Cambridge
1936, VIII + 336 pp.
[7] EDDINGTON, (Sir) A. S., The Philosophy of Physical Science. Cambridge 1939, X + 230 pp.
[8] EDDINGTON, (Sir) A. S., Fundamental Theory. Cambridge 1946, VIII + 292 pp.
[9] NAGEL, E., A budget of problems in the philosophy of science. The Philosophical
Review, vol. 66 (1957), pp. 205-225.
[10] TARSKI, A., Some methodological investigations on the definability of concepts.
Chapter X in Logic, Semantics, Metamathematics. Oxford 1956, XIV + 471 pp.
[11] WHITEHEAD, A. N., Universal Algebra, vol. 1. Cambridge 1898, XXVI + 586 pp.
Symposium on the Axiomatic Method
DEFINABLE TERMS AND PRIMITIVES IN AXIOM SYSTEMS
HERBERT A. SIMON
Carnegie Institute of Technology Pittsburgh, Pennsylvania, U.S.A.
An axiom system may be constructed for a theory of empirical phe-
nomena with any of a number of goals in mind. Some of these goals are
identical with those that motivate the axiomatization of mathematical
theories, hence relate only to the formal structure of the theory — its
syntax. Other goals for axiomatizing scientific theories relate to the
problems of verifying the theories empirically, hence incorporate
semantic considerations.
An axiom system includes, on the one hand, entities like primitive
terms, defined terms, and definitions, and on the other hand, entities like
axioms, theorems, and proofs. Tarski [10, p. 296] has emphasized the
parallelism between the first triplet of terms and the second. The usual
goals for axiomatizing deductive systems are to insure that neither more
nor less is posited by way of primitive terms and axioms than is necessary
and sufficient for the formal correctness of the definitions and proofs, and
hence the derivability of the defined terms and theorems. An axiom sys-
tem is usually accompanied by proofs of the independence, consistency,
and completeness of its axioms; and presumably should also be ac-
companied — although it less often is — by proofs of the independence,
consistency, and completeness of its primitive terms.
Frequently a set of sentences (axioms and theorems) and terms admits
alternative equivalent axiom systems: that is non-identical partitionings
of the sentences into axioms and theorems, respectively ; and of the terms
into primitive and defined terms. Hence, a particular set of axioms and
primitive terms may be thought of as a (not necessarily unique) basis for
a class of equivalent axiom systems.
In constructing an axiom system for an empirical theory, we may
wish to distinguish sentences that can be confronted more or less directly
with evidence (e.g., "the temperature of this water is 104°") from other
sentences. We may wish to make a similar distinction between predicates,
functors, and other terms that appear in such sentences (e.g., "temperature")
and those that do not. The terms "observation sentences" and
"observables" are often used to refer to such sentences and such terms,
respectively.¹
The distinction between observables and non-observables is useful in
determining how fully the sentences of a theory can be confirmed or dis-
confirmed by empirical evidence, and to what extent the terms of the
theory are operationally defined. In addition to the formal requirements,
discussed previously, we might wish to impose the following additional
conditions on an axiom system for an empirical theory:
(1) that the entire system be factorable into a subsystem that is equivalent
to some axiom system for a part of logic and mathematics, and a
remainder ;
(2) that in the remainder, axioms correspond to observation sentences,
and primitive terms to observables.
Condition (2) is, of course, a semantic rather than a syntactic condition,
and has no counterpart in the axiomatization of mathematical theories.
The usefulness of the condition is that, if it is met, the empirical testability
of observation sentences guarantees the testability of all the sentences in
the system, and the operational definability of observables guarantees the
operationality of all the terms. In the remainder of this paper we shall
explore some problems that arise in trying to satisfy Condition (2), and
some modifications in the notion of definability — as that term is used in
formal systems — that are needed to solve these problems.
The question of what characteristics an axiom system should possess
has been raised in the past few years [9] in connection with the definability
of mass in Newtonian mechanics. In one recent axiomatization of New-
tonian particle mechanics [5] particular care is taken to meet the syntactic
conditions for a satisfactory axiomatization, and mass is introduced as a
primitive term. In another axiomatization [8] special attention is paid to
semantic questions, and definitory equations for mass are introduced.
Definability and Generic Definability. Tarski [10] has proposed a
definition of the term definability in a deductive system, and has shown
how this definition provides a theoretical foundation for the method
employed by Padoa [6] to establish whether particular terms in a system
are definable or primitive. In their axiomatization of classical particle
mechanics, McKinsey, Sugar and Suppes [5, Paragraph 5] employ the
method of Padoa to show that, by Tarski's definition, mass and force are
primitive terms in their system. Application of the same method to Simon's
earlier axiomatization of Newtonian mechanics [8] gives the same
result: mass and force are primitives in that system.
¹ For a more extended discussion of these terms, see [2, pp. 454-456].
The latter result appears to conflict with common-sense notions of
definability, since in [8] the masses of the particles can (in general) be
computed when their positions and accelerations are known at several
points in time [8, Theorem I]. Condition (2) of the previous section is
violated if masses, which are not observables, are taken as primitive
terms; and it appears paradoxical that it should be possible to calculate
the masses when they are neither observables nor defined terms. These
difficulties suggest that Tarski's concept of definability is not the most
satisfactory one to use in the axiomatization of empirical science.
A closer examination of the situation, for [8], shows that the masses are
not uniquely determined in certain situations that are best regarded as
special cases — e.g., the case of a single unaccelerated particle. It is by
the construction of such special cases, and the application of the method
of Padoa to them, that McKinsey, Sugar and Suppes show mass to be a
primitive in [5], and by inference in [8]. But I shall show that if the defi-
nition of Tarski is weakened in an appropriate way to eliminate these
special cases it no longer provides a justification for the method of Padoa,
but does provide a better explication of the common-sense notion of
definability.
Statement of the Problem. We shall discuss the problem here in an
informal manner. The treatment can easily be formalized along the lines
of Tarski's paper.² In Tarski's terms [10, p. 299], the formula $\phi(x; b', b'', \ldots)$
defines the extra-logical constant $a$ if, for every $x$, $x$ satisfies $\phi$ if and only
if $x$ is identical with $a$; i.e., if:
(I) $(x) : x = a \,.\equiv.\, \phi(x; b', b'', \ldots)$,
where $x$ is the only real variable in $\phi$, and $b', b'', \ldots$ are the members of a set
$B$ of extra-logical constants (primitives and/or defined terms).
Translated into these terms, the (attempted) definition of "the mass of
particle i" in [8, p. 892] proceeds thus: (1) We take as the function $\phi$ the
conjunction of the six scalar equations that state the laws of conservation
of momentum and conservation of angular momentum for a system of
particles. (2) We take as the set $B$ the paths of the particles in some time
interval. (3) We take as $x$ the set of numbers $m_i$ that satisfy $\phi$ for the
given $B$.
² Compare also [2, p. 439].
This procedure does not satisfy Tarski's definition since the existence
and uniqueness of the masses is not guaranteed. For example, in the case
of a single, unaccelerated particle, any number, m, substituted in the
equations for conservation of momentum and angular momentum will
satisfy those equations. But Tarski shows (his Theorem 2) that if two
constants satisfy a definitory formula for a particular set, B, they must
be identical.
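The non-uniqueness in the single-particle case can be checked numerically. The following is only an illustrative sketch and not part of [8]: it assumes motion in one dimension sampled at equal time steps, approximates accelerations by second differences, and tests only the momentum equation (the sum of m_i a_i vanishing at each instant). The function names are hypothetical.

```python
def second_differences(path, dt=1.0):
    """Approximate accelerations from equally spaced positions."""
    return [(path[k + 1] - 2 * path[k] + path[k - 1]) / dt**2
            for k in range(1, len(path) - 1)]

def satisfies_momentum_law(masses, paths, tol=1e-9):
    """True if sum_i m_i * a_i(t) = 0 at every sampled instant."""
    accs = [second_differences(p) for p in paths]
    return all(abs(sum(m * a[k] for m, a in zip(masses, accs))) < tol
               for k in range(len(accs[0])))

# One particle moving uniformly: x(t) = 3 + 2t, so its acceleration vanishes.
uniform_path = [3 + 2 * t for t in range(6)]

# Every candidate mass satisfies the law, so the law cannot single one out.
candidates = [0.5, 1.0, 7.0, 1000.0]
all_pass = all(satisfies_momentum_law([m], [uniform_path]) for m in candidates)
```

Since every candidate passes, the formula has more than one satisfier for this path, which is exactly what Tarski's Theorem 2 excludes for a genuine definition.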
Generic Definition. To remove the difficulty, we replace Tarski's
definition with a weaker one: the formula $\varphi(x; b', b'', \ldots)$ DEFINES
GENERICALLY the extra-logical constant $a$ if, for every $x$, if $x$ is identical with
$a$, $x$ satisfies $\varphi$:
(I') $(x) : x = a \;.\supset.\; \varphi(x; b', b'', \ldots)$.
After the equivalence symbol in formula (I) has been replaced by an
implication in this way, the three theorems of Tarski's paper are no
longer provable. In particular, formula (7) in his proof of Theorem 1 [10,
pp. 301-302] can no longer be derived from the modified forms of his
formulas (3) and (6). Hence, the method of Padoa cannot be used to
disqualify a proposed generic definition.
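The difference between (I) and (I') can be made concrete over a finite range of values. The sketch below is an illustration under stated assumptions: the formula is modelled as a Python predicate of x alone (the constants b', b'', ... are taken as already fixed), the quantifier ranges over a small finite domain, and the names `defines` and `defines_generically` are hypothetical.

```python
def defines(phi, a, domain):
    """Formula (I): for every x, x == a if and only if phi(x)."""
    return all((x == a) == phi(x) for x in domain)

def defines_generically(phi, a, domain):
    """Formula (I'): for every x, x == a implies phi(x)."""
    return all(phi(x) for x in domain if x == a)

domain = range(-5, 6)
unique_phi = lambda x: 2 * x == 6      # satisfied by 3 only
loose_phi = lambda x: x * 0 == 0       # satisfied by every x (cf. zero acceleration)

results = {
    "unique/definition": defines(unique_phi, 3, domain),
    "unique/generic": defines_generically(unique_phi, 3, domain),
    "loose/definition": defines(loose_phi, 3, domain),
    "loose/generic": defines_generically(loose_phi, 3, domain),
}
```

With the loose formula, both 3 and 4 are generically defined, so two distinct constants can satisfy the same generic definition; under (I) this cannot happen.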
It is easy to show that in [8] mass is generically defined by means of the
paths of the particles on the basis of the Third Law of Motion (more
exactly, the laws of conservation of momentum and angular momentum) ;
and that resultant force is generically defined by means of the paths of
the particles and their masses on the basis of the Third and Second Laws
of Motion [8, p. 901]. Similarly, we can show that in [5, p. 258] resultant
force is generically defined by means of the paths of the particles and their
masses on the basis of the Second Law of Motion.
The advantage of substituting generic definition for definition is that,
often, a constant is not uniquely determined for all possible values of the
other extra-logical constants, but experimental or observational circum-
stances can be devised that do guarantee for those circumstances the
unique determination of the constant.
In the axiom system of [8], for example, the conditions under which
masses exist for a system of particles and the conditions under which
these masses are unique have reasonable physical interpretations. The
observables are the space-time coordinates of the particles. From a
physical standpoint, we would expect masses (not necessarily unique) to
be calculable from the motion of a set of particles, using the principles
of conservation of momentum and angular momentum, whenever this
set of particles was physically isolated from other particles. Moreover,
we would expect the relative masses to be uniquely determined whenever
there was no proper subset of particles that was physically isolated from
the rest. These are precisely the conditions for existence (Definition 3)
and uniqueness (Theorem 1 and Definition 6) of the masses in this axio-
matization. Thus, the definition of mass in [8] does not lead to a unique
determination of the mass of a single star at a great distance from other
stars, but does permit the calculation, uniquely up to a factor of pro-
portionality, of the masses of the members of the solar system from obser-
vation of their paths alone, and without postulating a particular force
law [8, pp. 900-901].
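The determination of masses only up to a factor of proportionality can be illustrated with two particles. The sketch below is not the construction of [8]: it assumes one-dimensional motion, unit time steps, and the momentum law alone, and the helper names are hypothetical.

```python
from fractions import Fraction

def accelerations(path):
    """Second differences of positions sampled at unit time steps."""
    p = [Fraction(v) for v in path]
    return [p[k + 1] - 2 * p[k] + p[k - 1] for k in range(1, len(p) - 1)]

def mass_ratio(path1, path2):
    """Return m2/m1 implied by m1*a1 + m2*a2 = 0 at every instant.

    Returns None when the paths leave the ratio undetermined, and raises
    ValueError when no single ratio fits all instants.
    """
    ratios = set()
    for a1, a2 in zip(accelerations(path1), accelerations(path2)):
        if a1 == 0 and a2 == 0:
            continue                       # this instant carries no information
        if a2 == 0:
            raise ValueError("paths violate the momentum law")
        ratios.add(-a1 / a2)
    if len(ratios) > 1:
        raise ValueError("paths violate the momentum law")
    return ratios.pop() if ratios else None

# Two interacting particles: x1 = t^2 and x2 = -t^2/2, so a1 = 2 and a2 = -1.
path1 = [t * t for t in range(6)]
path2 = [Fraction(-t * t, 2) for t in range(6)]
ratio = mass_ratio(path1, path2)          # m2/m1

# Two particles that never accelerate: the ratio is left open.
free = mass_ratio([t for t in range(6)], [2 * t for t in range(6)])
```

Only the ratio m2/m1 is recovered; the pairs (1, 2) and (3, 6) satisfy the law equally well, and for the unaccelerated pair nothing is determined at all.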
OTHER CONCEPTS OF DEFINABILITY
The sharp distinctions between axioms and theorems, and between
primitive and defined terms have proved useful dichotomies in axio-
matizing deductive systems. We have seen that difficulties arise in pre-
serving the latter distinction in empirical systems, when the axiom system
is required to meet Condition (2) — when primitive terms are identified
with observables. But it has long been recognized that comparable
difficulties arise from the other half of Condition (2), that is, from the
identification of axioms with observation sentences. In our axiomatization
of Newtonian mechanics, for example, the law of conservation of momen-
tum, applied to an isolated system of particles, is an identity in time
containing only a finite number of parameters (the masses). If time is
assumed to be a continuous variable, this law comprises a nondenumer-
able infinity of observation sentences. Hence, the law is not itself an
observation sentence nor is it derivable from a finite set of observation
sentences.
The two difficulties — that with respect to axioms and that with respect
to primitives — arise from analogous asymmetries. In a system of New-
tonian mechanics, given the initial conditions and masses of a system of
particles, we can deduce univocally their paths. Given their paths, we
may or may not be able to derive unique values for the masses. Given the
laws and values of the generically defined primitives, we can deduce
observation sentences; given any finite set of observation sentences, we
cannot generally deduce laws. When the matter is put in this way, the
asymmetry is not surprising, and it is easy to see that the thesis of naive
logical positivism — essentially the thesis of Condition (2) — is untenable
unless it is weakened substantially.
Contextual Definitions, Implicit Definitions and Reduction Sentences.
Revisions of the concept of definition similar in aim to that discussed here
have been proposed by a number of empiricists. Quine's [7, p. 42] notion of
contextual definition, while nowhere spelled out formally, is an example:
The idea of defining a symbol in use was, as remarked, an advance over the
impossible term-by-term empiricism of Locke and Hume. The statement,
rather than the term, came with Frege to be recognized as the unit accountable
to an empiricist critique. But what I am now urging is that even in taking the
statement as unit we have drawn our grid too finely. The unit of empirical
significance is the whole of science.
Braithwaite [1] carries the argument a step further by pointing out
advantages of having in an empirical theory certain terms that are not
uniquely determined by observations. His discussion of this point [1, pp.
76-77] is worth quoting :
We can, however, extend the sense of definition if we wish to do so. In explicit
definition, which we have so far considered, the possibilities of interpreting a
certain symbol occurring in a calculus are reduced to one possibility by the
requirement that the symbol should be synonymous (within the calculus) with
a symbol or combination of symbols which have already been given an inter-
pretation. But the possibilities of interpreting a certain symbol occurring in a
calculus may be reduced without being reduced to only one possibility by the
interpretation already given of other symbols occurring in the formulae in the
calculus. If we wish to stress the resemblance between the reduction of the
possibilities of interpreting a symbol to only one possibility and the reduction
of these possibilities but not to only one possibility, instead of wishing to stress
(as we have so far stressed) the difference between these two sorts of reduction,
we shall call the second reduction as well as the first by the name of definition,
qualifying the noun by such words as "implicit" or "by postulate." With this
extension of the meaning of definition the thesis of this chapter can be ex-
pressed by saying that, while the theoretical terms of a scientific theory are
implicitly defined by their occurrence in initial formulae in a calculus in which
there are derived formulae interpreted as empirical generalizations, the theo-
retical terms cannot be explicitly defined by means of the interpretations of the
terms in these derived formulae without the theory thereby becoming in-
capable of growth.
As a final parallel, I will mention Carnap's concept of reduction sentence
in his essay on Testability and Meaning [2, p. 442]. A reduction sentence
for $Q_3$ is a sentence of the form, $Q_2 \supset (Q_1 \supset Q_3)$, where $Q_2$ is interpreted as
the set of conditions under which the subsidiary implication holds, and
where $Q_1$ is interpreted as a (partial) definiens for $Q_3$. Thus, let $Q_2$ be the
statement that a set of particles is isolated; $Q_1$ be the statement that a
certain vector, $m$, substituted for the coefficients in the equations stating
the laws of conservation of momentum and angular momentum for the
particles, satisfies those equations; and $Q_3$ be the statement that the
components of $m$ are masses of the particles. Then $Q_2 \supset (Q_1 \supset Q_3)$ is
essentially identical with the definition of mass in [8]. The subsidiary
connective is an implication rather than an equivalence because there is
no guarantee that another vector, $m'$, may not also constitute a satisfactory
set of masses, so that $Q_2 \supset (Q_1' \supset Q_3')$, where $Q_1'$ is derived from
$Q_1$, and $Q_3'$ from $Q_3$ by substituting $m'$ for $m$.
Definability Almost Everywhere. In preference to either definability or
generic definability, we might want to have a term midway in strength
between these two — a notion of definability that would guarantee that
we could "usually" determine the defined term univocally, and that the
cases in which we could not would be in some sense exceptional. Under
certain conditions it is, in fact, possible to introduce such a term. Suppose
that B is a point in some space possessing a measure, and let there be a
sentence of form (I) that holds almost everywhere in the space of B. Then, we
say that a is DEFINED ALMOST EVERYWHERE.
If, in [8], we take B as the time path of the system which satisfies the
axioms in some interval k < t < m, and take the Lebesgue measure in
the appropriate function space for the B's as the measure function, then
mass is defined almost everywhere, as is resultant force.
DEFINABILITY AND IDENTIFIABILITY
It has not generally been noted that the problem of definability of non-
observables in axiomatizations of empirical theories is identical with what
has been termed the "identification problem" in the literature of mathe-
matical statistics [4, p. 70; 9, pp. 341-342]. The identification problem is
the problem of estimating the parameters that appear in a system of
equations from observations of the values of the variables in the same
system of equations.
Some Types of Identifiability Problems. Consider, for example, a system
of linear equations:
(1) $\sum_j a_{ij} x_j = b_i \quad (i = 1, \ldots, n)$
where the x's are observables and the a's and b's are parameters. The
a's and b's are generically defined by this system of equations, but they
are not defined in Tarski's sense, for, no matter how many sets of obser-
vations of the x's we have, the a's and b's are not uniquely determined.
For suppose that $A$ and $b$ are a matrix and vector, respectively, that
satisfy (1) for the observed x's. 3 Then $A'$ and $b'$ will also satisfy (1), where
$A' = PA$ and $b' = Pb$ for any non-singular matrix $P$. To identify the a's
and b's (that is, to make it possible to estimate them uniquely), additional
constraints beyond those embodied in equations (1) must be
introduced.
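The transformation argument can be checked on a small numerical case. The sketch below assumes that system (1) has the matrix form A x = b, which is how the surrounding text treats it; the particular numbers and the helper names are hypothetical.

```python
def matmul(P, A):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(P))]

def matvec(A, x):
    """Product of a matrix and a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1], [1, 3]]
b = [5, 10]
x_obs = [1, 3]                 # an observation satisfying A x = b
P = [[1, 1], [0, 2]]           # a non-singular matrix (determinant 2)

A2 = matmul(P, A)              # A' = P A
b2 = matvec(P, b)              # b' = P b

fits_original = matvec(A, x_obs) == b
fits_transformed = matvec(A2, x_obs) == b2
```

Both parameter sets fit the same observation although A' differs from A, so no number of observations of the x's can separate them.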
On the other hand, consider the system of linear difference equations:
(2) $\sum_j a_{ij} x_j(t) = x_i(t + 1) \quad (i = 1, \ldots, n)$
where, as before, the x's are observables, and the a's and b's constant
parameters. In this case, the a's are defined almost everywhere in the
space of x(t). There are $n^2$ parameters to be estimated, and the number
of equations of form (2) available for estimating them is $n(k - 1)$, where
k is the number of points in time at which the x's are observed. Hence, for
almost all paths of the system, and for $k > n + 1$, the a's will be determined
uniquely. 4
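The counting argument can be tried out directly. The sketch below is an illustration only: it uses exact rational arithmetic, observes n + 1 instants (the smallest number for which n(k - 1) equals n^2), and recovers the a's row by row; an exceptional path shows up as a singular system. The function names are hypothetical.

```python
from fractions import Fraction

def step(A, x):
    """One step of system (2): x(t + 1) = A x(t)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def solve(M, rhs):
    """Solve the square system M z = rhs exactly by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("exceptional path: parameters not identified")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def identify(observations):
    """Recover the matrix of a's from x(0), ..., x(n)."""
    n = len(observations[0])
    M = observations[:n]      # row t is x(t); it yields one equation per row of a's
    return [solve(M, [observations[t + 1][i] for t in range(n)])
            for i in range(n)]

A_true = [[1, 2], [3, 1]]
xs = [[1, 0]]
for _ in range(2):
    xs.append(step(A_true, xs[-1]))
A_est = identify(xs)
```

A path that starts at the origin stays there and leaves every a undetermined; `identify` then raises `ValueError`, which corresponds to one of the exceptional cases excluded by "almost everywhere".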
We see that the system of equations (2) is quite analogous to the system
of equations used in [8] to define mass. In the latter system, for n particles,
having 3n position coordinates, there are 6 second order differential
equations (three for conservation of momentum, three for conservation
of angular momentum) that are homogeneous in the m's, and that must
hold identically in t. There are $(n - 1)$ parameters to be estimated (the
number of mass-ratios of the particles, referred to a particular one of
them as unit). Hence, for almost all paths of the system, the mass-ratios
3 In this entire discussion, we are disregarding errors of observation and the fact
that the equations may be only approximately satisfied. For an analysis that takes
into account these additional complications, the reader must refer to [3] and [4].
4 The convenience of replacing identifiability (equivalent to Tarski's definability)
by almost-everywhere identifiability (equivalent to almost-everywhere definability)
has already been noted in the literature on the identification problem [4, p. 82; 3,
p. 53].
can be estimated uniquely from observations of the positions of the
particles at f \- 2J points in time.
Correspondingly, the system of equations (1) is analogous to the
system of equations used in [8, p. 901] to define the component forces
between pairs of particles. Component forces are only generically defined.
Hence, although the masses of particles in a system and the resultant
forces acting upon them can, in general, be estimated if there is a sufficient
number of observations of the positions of the particles, the component
forces cannot be so estimated unless additional identifying assumptions
are introduced. Such additional assumptions might, for example, take
the form of a particular force law, like the inverse square law of gravi-
tational attraction.
Over-Identification and Testability. When a scientific theory is axio-
matized with a view to clarifying the problems of testing the theory, a
number of considerations are present that do not appear in axiomatizing
deductive systems. Hence, it may be undesirable to imitate too closely
the canons usually prescribed for the latter type of axiomatization. In
addition to distinguishing primitive from defined terms, it may be
advantageous to subdivide the former class so as to distinguish terms that
are defined almost everywhere or that are only generically defined.
More fundamentally, whether particular terms are univocally deter-
mined by the system will depend not only on the specific sentences that
have the form of definitions of these terms, but upon the whole set of
sentences of the system. Our analysis of an actual axiom system for
Newtonian particle mechanics bears out the contentions of Braithwaite
and Quine that the definitions of non-observables often are, and must
be, "implicit" or "contextual."
What does the analysis suggest, on the positive side, as a substitute for
the too strict Condition (2) ? In general, there will appear in an axiom
system terms that are direct observables, and terms that are not. A
minimum requirement from the standpoint of empiricism is that the
system as a whole be over-identified: that there be possible sets of
observations that would be inconsistent, collectively, with the sentences
of the system. We have seen that this condition by no means guarantees
that all the non-observables of the system will be defined terms, or even
defined almost everywhere.
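Over-identification in this sense can be illustrated by observations that no assignment of masses could reconcile with the momentum law. The sketch below is a simplified illustration, not the test procedure of any cited work: it assumes two particles in one dimension, strictly positive masses, and exactly known accelerations; the function name is hypothetical.

```python
from fractions import Fraction

def admits_positive_masses(acc_pairs):
    """acc_pairs lists (a1, a2), the accelerations of two particles.

    The momentum law needs m1*a1 + m2*a2 == 0 with m1, m2 > 0 at each
    instant, i.e. one positive value of m2/m1 that fits all instants.
    """
    ratio = None                           # candidate value of m2/m1
    for a1, a2 in acc_pairs:
        if a1 == 0 and a2 == 0:
            continue                       # uninformative instant
        if a1 == 0 or a2 == 0 or (a1 > 0) == (a2 > 0):
            return False                   # accelerations not opposite in sign
        r = Fraction(-a1) / Fraction(a2)
        if ratio is not None and r != ratio:
            return False                   # two instants demand different ratios
        ratio = r
    return True

consistent = admits_positive_masses([(2, -1), (4, -2)])
refuting = admits_positive_masses([(2, -1), (4, -1)])
same_sign = admits_positive_masses([(1, 1)])
```

The second and third sets of observations are inconsistent with the law whatever the masses, so the law is testable even though the masses themselves are not observables.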
A more radical empiricism would require that it be possible, by making
a sufficient number of observations, to determine uniquely the values of all
parameters that appear in the system. To take a simple example, a strict
interpretation of this condition would not permit masses to appear in the
axiomatization of Newtonian mechanics, but only mass-ratios. Resultant
forces would be admissible, but not component forces, unless sufficient
postulates were added about the form of the force law to overdetermine
them. We may borrow Quine's phrase for this requirement, and say that
when it is satisfied for some set of terms, the terms are defined contextually
by the system. 5 The condition that all non-observables be defined con-
textually is still much weaker, of course, than the condition that they be
defined.
For reasons of elegance, we may sometimes wish to stop a little short
of insisting that all terms in a system be defined contextually. We have
already mentioned a suitable example of this. In [8] mass ratios are defined
almost everywhere, but masses are not defined contextually, even in an
almost-everywhere sense. Still, we would probably prefer the symmetry of
associating a mass number with each particle to a formulation that
arbitrarily selected one of these masses as a numeraire.
Braithwaite has given us another reason, from the semantic side, for
not insisting on contextual definition of all terms. He observes that if we
leave some degrees of freedom in the system, this freedom allows us later
to add additional axioms to the system, without introducing internal
inconsistencies, when we have reason to do so. Thus, since the law of
conservation of energy does not determine the zero of the temperature
scale, the zero may be fixed subsequently by means of the gas laws.
Regardless of what position we take on empiricism in axiomatizing
scientific theories, it would be desirable to provide for any axiom system
theorems characterizing not only its syntactical properties (e.g., the
independence, consistency, and completeness of the axioms), but its
semantic properties (e.g., the degree of identifiability of its non-observa-
bles) as well.
5 Braithwaite's "implicit definition" will not do here, for he applies it specifically
to the weaker condition of the previous paragraph.
Bibliography
[1] BRAITHWAITE, R. B., Scientific Explanation. Cambridge 1955, XII + 376 pp.
[2] CARNAP, R., Testability and meaning. Philosophy of Science, vol. 3 (1936), pp.
419-471, and vol. 4 (1937), pp. 1-40.
[3] HOOD, W., and T. C. KOOPMANS (eds.), Studies in Econometric Method. New
York 1953, XIX + 323 pp.
[4] KOOPMANS, T. C. (ed.), Statistical Inference in Dynamic Economic Models.
New York 1950, XIV + 439 pp.
[5] McKINSEY, J. C. C., A. C. SUGAR and P. SUPPES, Axiomatic foundations of
classical particle mechanics. Journal of Rational Mechanics and Analysis, vol.
2 (1953), pp. 253-272.
[6] PADOA, A., Essai d'une théorie algébrique des nombres entiers, précédé d'une
introduction logique à une théorie déductive quelconque. Bibliothèque du Congrès
International de Philosophie, vol. 3 (1900).
[7] QUINE, W., From a Logical Point of View. Cambridge (Mass.) 1953, VI + 184
pp.
[8] SIMON, H. A., The axioms of Newtonian mechanics. Philosophical Magazine,
ser. 7, vol. 33 (1947), pp. 888-905.
[9] SIMON, H. A., Discussion: the axiomatization of classical mechanics. Philosophy of
Science, vol. 21 (1954), pp. 340-343.
[10] TARSKI, A., Some methodological investigations on the definability of concepts.
Chapter 10 in Logic, Semantics, Metamathematics, Oxford 1956, XIV + 467 pp.
Symposium on the Axiomatic Method
AN AXIOMATIC THEORY OF FUNCTIONS AND FLUENTS
KARL MENGER
Illinois Institute of Technology, Chicago, Illinois, U.S.A.
The topic of this paper is a theory of some basic applications of mathe-
matics to science. Part I deals with concepts of pure mathematics such
as the logarithm, the second power, and the product, and with substi-
tutions in the realm of those functions. Part II is devoted to scientific
material such as time, gas pressure, coordinates — objects that Newton
called fluents. Part III formulates articulate rules for the interrelation of
fluents by functions. Properly relativized, the latter play that connective
role for which Leibniz originated the term function.
I. FUNCTIONS
Explicitly, a real function with a real domain — briefly, a function —
may be defined as a class of consistent ordered pairs of real numbers.
Here and in the sequel, two ordered pairs of any kind are called con-
sistent unless their first members are equal while their second members
are unequal. If each pair $\in f_1$ (that is, belonging to the function $f_1$) is also
$\in f_2$ (in symbols, if $f_1 \subset f_2$), then $f_1$ is called a restriction of $f_2$, and $f_2$ an
extension of $f_1$. The empty function (including no pair) will be denoted
by $\emptyset$. The class of the first (the second) members of all pairs $\in f$ is called
the domain of $f$ or dom $f$ (the range of $f$ or ran $f$). If ran $f$ includes exactly
one number, then $f$ is said to be a constant function.
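These explicit definitions translate almost word for word into operations on finite classes of pairs. The sketch below is only an illustration: it replaces real functions by finite samples on integers, and all names are hypothetical.

```python
def is_function(pairs):
    """A class of ordered pairs is a function if its pairs are mutually consistent."""
    seen = {}
    for x, y in pairs:
        if seen.setdefault(x, y) != y:
            return False      # equal first members, unequal second members
    return True

def dom(f):
    """Class of the first members of all pairs in f."""
    return {x for x, _ in f}

def ran(f):
    """Class of the second members of all pairs in f."""
    return {y for _, y in f}

def is_restriction(f1, f2):
    """f1 is a restriction of f2 if each pair in f1 is also in f2."""
    return f1 <= f2

square = {(x, x * x) for x in range(-3, 4)}
square_pos = {(x, y) for x, y in square if x >= 0}
const_two = {(x, 2) for x in range(-3, 4)}
empty = set()
```

Here `square_pos` is a restriction of `square`, the empty function is a restriction of every function, and `const_two` is a constant function because its range includes exactly one number.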
The following typographical convention 1 will be strictly adhered to:
roman type for numbers; italic type for functions.
For instance, the logarithmic function (briefly, log) is the class of all pairs
$(a, \log a)$ for any $a > 0$. The constant function consisting of all pairs $(x, 0)$ for
any $x$ will be denoted 2 by $\mathit{0}$. The following are examples of a formula and a
general statement, respectively: $\log e = 1$, and $\emptyset \subset f$ for any $f$. Here, 0, 1, e as
1 Cf. Menger [10] referred to in the sequel as Calculus.
2 Symbols for constant functions that are more elaborate than italicized numerals,
such as $c_1$ and $c_0$, must be used in order to express certain laws; e.g., that
$c_{0+1} = c_0 + c_1$.
well as log, $\mathit{0}$, and $\emptyset$ are designations of specific entities, while $a$, $x$, and $f$ are
variables (i.e., symbols replaceable with the designations of specific entities
according to the respective legends), number variables or function variables as
indicated typographically.
The intersection of any two functions is a function; e.g., that of cos and
sin is the class of all pairs $((4n + 1)\pi/4,\ (-1)^n/\sqrt{2})$ for any integer $n$.
The union of cos and sin, however, is not a function. From the set-theoretical
point of view, functions do not constitute a Boolean algebra 3. But
any two functions have a sum, a difference, a product, and a quotient
provided $f_1/f_2$ is defined as the class of all pairs $(x, q)$ such that $(x, p_1) \in f_1$,
$(x, p_2) \in f_2$ and $p_1/p_2 = q$ for some $p_1$ and $p_2$, a definition that dispenses
with any reference to zeros in the denominators. For instance,
$\cot = 1/\tan$, $f/\mathit{0} = \emptyset$, and $(f_1/f_2) \cdot f_2 \subset f_1$ for any $f$, $f_1$, and $f_2$.
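The behaviour of intersection, union, and quotient can be tried out on finite classes of pairs. The sketch below is an illustration under the assumption that functions are sampled on a few integers; the quotient follows the definition just given, so a zero in the denominator simply contributes no pair. All names are hypothetical.

```python
from fractions import Fraction

def is_function(pairs):
    """True if no two pairs share a first member."""
    firsts = [x for x, _ in pairs]
    return len(firsts) == len(set(firsts))

def quotient(f1, f2):
    """All pairs (x, q) with (x, p1) in f1, (x, p2) in f2 and p1/p2 = q."""
    d2 = dict(f2)
    return {(x, Fraction(p1, d2[x])) for x, p1 in f1 if x in d2 and d2[x] != 0}

def product(f1, f2):
    """All pairs (x, p1 * p2) with (x, p1) in f1 and (x, p2) in f2."""
    d2 = dict(f2)
    return {(x, p1 * d2[x]) for x, p1 in f1 if x in d2}

xs = range(-3, 4)
square = {(x, x * x) for x in xs}
ident = {(x, x) for x in xs}
zero = {(x, 0) for x in xs}

meet = square & ident                      # pairs common to both functions
join = square | ident                      # contains inconsistent pairs
q = quotient(square, ident)                # undefined at x = 0
back = product(q, ident)                   # a proper restriction of square
```

The intersection is again a function and the union is not; dividing by the constant function zero yields the empty function; and multiplying the quotient by the denominator gives back only a restriction of the numerator.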
Moreover, any function $f_2$ may be substituted into any function $f_1$, the
result $f_1 f_2$ (denoted by mere juxtaposition, whereas multiplication will
always be denoted by a dot!) being the class of all pairs $(x, z)$ such that
$(x, y) \in f_2$ and $(y, z) \in f_1$ for some $y$. The identity function, i.e., the class
of all pairs $(x, x)$ for any $x$ (an object of paramount importance), will
be denoted 4 by $j$. Its main property is bilateral neutrality under substitution:
(1) $jf = f = fj$ for any $f$.
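Substitution and the neutrality law (1) can be checked on finite classes of pairs. In the sketch below, which is an illustration only, the identity function is represented by its restriction to a finite universe large enough to contain the domain and range of the sample function; the names are hypothetical.

```python
def substitute(f1, f2):
    """f1 f2: all pairs (x, z) with (x, y) in f2 and (y, z) in f1 for some y."""
    d1 = dict(f1)
    return {(x, d1[y]) for x, y in f2 if y in d1}

universe = range(-4, 5)
j = {(x, x) for x in universe}               # identity on the finite universe
f = {(x, x * x - 3) for x in range(-2, 3)}   # values lie in {-3, -2, 1}

left = substitute(j, f)      # j f
right = substitute(f, j)     # f j
```

Both `left` and `right` coincide with `f`, which is the finite counterpart of (1).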
For each $f$, there is a bilaterally inverse function 5, Inv $f$, which is the
largest class of pairs $(x, y)$ such that $(y, x) \in f$ and that, under substitution,
$f\,\mathrm{Inv}\,f \subset j$ and $(\mathrm{Inv}\,f)f \subset j$. For instance, Inv $j^3 = j^{1/3}$ and Inv exp =
log. If $j_+^2$ is the class of all pairs $(x, x^2)$ for any $x \ge 0$, then
Inv $j_+^2 = j^{1/2}$, Inv $j^{1/2} = j_+^2$; similarly, Inv $j_-^2 = -j^{1/2}$, Inv $(-j^{1/2}) = j_-^2$.
But Inv $j^2$ consists of the single pair (0, 0); and Inv cos = $\emptyset$, while Inv $f$
3 For this reason, the postulational theories of binary relations, which are based
on Boolean algebra (cf. especially, McKinsey [8] p. 85 and Tarski [22] p. 73), are
inapplicable to functions.
4 Cf. Calculus, p. 74 and pp. 99-105. Cf. also Menger [11] and [12].
5 Cf. Calculus, pp. 91-95, where Inv $f$ is denoted by $j // f$. The fertility of this
concept of inverse functions has been brought out by M. A. McKiernan's interesting
and promising studies on operators. Cf. McKiernan [6], [7].
is a branch of arccos if $f \subset \cos$ and dom $f$ is the interval $[n\pi, (n + 1)\pi]$, for
some integer $n$.
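For finite classes of pairs the bilateral inverse can be computed directly. The sketch below rests on an observation that follows from the two inclusion conditions: a reversed pair (y, x) can be kept exactly when y is the value of f at x and at no other argument. It is an illustration on integer samples, with hypothetical names.

```python
from collections import Counter

def substitute(f1, f2):
    """f1 f2: all pairs (x, z) with (x, y) in f2 and (y, z) in f1 for some y."""
    d1 = dict(f1)
    return {(x, d1[y]) for x, y in f2 if y in d1}

def inv(f):
    """Reversed pairs (y, x) of f, kept only when y is taken at exactly one x."""
    hits = Counter(y for _, y in f)
    return {(y, x) for x, y in f if hits[y] == 1}

xs = range(-3, 4)
j = {(x, x) for x in range(-30, 31)}
cube = {(x, x ** 3) for x in xs}
square = {(x, x * x) for x in xs}
square_pos = {(x, x * x) for x in range(0, 4)}
parity = {(x, x % 2) for x in xs}          # every value is taken several times

inv_square = inv(square)                   # only the pair (0, 0) survives
inv_parity = inv(parity)                   # empty, like Inv cos
```

The sample mirrors the examples in the text: the cube and the non-negative branch of the square are fully invertible, the full square retains only the pair (0, 0), and a function that takes every value more than once has the empty function as its inverse.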
In the traditional literature, the identity function has remained anonymous —
one of the symptoms for the neglect of substitution in analysis. The usual
reference — " the function x" — and the symbol x are complete failures in
basic assertions. Even in order to assert substitutive neutrality, concisely ex-
pressed in (1), analysts are forced to introduce a better symbol than x — an ad
hoc name of the identity function, say h — and then must resort to an awkward
implication: If $h(x) = x$ for each $x$, then $h(f(x)) = f(x) = f(h(x))$ for any $f$
and any $x \in$ dom $f$.
The overemphasis on additive-multiplicative processes, which is
characteristic of mathematics in the second quarter of this century,
becomes particularly striking in passing from theories of functions based
on explicit definitions to postulational theories — theories of rings of
functions, of linear function spaces, etc., which stress those properties
that functions or entities of any kind share with numbers. One of the few
exceptions doing justice to substitution is the trioperational algebra of
analysis 6. In it, functions are (undefined) elements subject to three (undefined)
operations. With regard to the first two, denoted by $+$ and $\cdot$,
the elements constitute a ring including neutral elements, $\mathit{0}$ and $\mathit{1}$. The
third, called substitution and denoted by juxtaposition, is associative and
right-distributive with regard to the ring operations 7:
(2) $(f + g)h = fh + gh$ and $(f \cdot g)h = fh \cdot gh$ for any $f$, $g$, $h$.
For many purposes, it is important to postulate a neutral element $j$
satisfying (1).
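The laws (2) can be verified on finite samples. The sketch below is an illustration with three hypothetical sample functions on integers; it also shows, as a side remark not made in the text, that the corresponding left-hand law fails for these samples.

```python
def add(f, g):
    """Sum: pairs (x, p + q) with (x, p) in f and (x, q) in g."""
    d = dict(g)
    return {(x, y + d[x]) for x, y in f if x in d}

def mul(f, g):
    """Product: pairs (x, p * q) with (x, p) in f and (x, q) in g."""
    d = dict(g)
    return {(x, y * d[x]) for x, y in f if x in d}

def substitute(f1, f2):
    """f1 f2: all pairs (x, z) with (x, y) in f2 and (y, z) in f1 for some y."""
    d1 = dict(f1)
    return {(x, d1[y]) for x, y in f2 if y in d1}

xs = range(-40, 41)
f = {(x, x + 1) for x in xs}
g = {(x, 2 * x) for x in xs}
h = {(x, x * x) for x in xs}

right_sum = substitute(add(f, g), h) == add(substitute(f, h), substitute(g, h))
right_prod = substitute(mul(f, g), h) == mul(substitute(f, h), substitute(g, h))
left_sum = substitute(h, add(f, g)) == add(substitute(h, f), substitute(h, g))
```

Both right-distributive identities hold pair for pair, including the domains; `left_sum` is false, since squaring a sum is not the sum of the squares.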
Trioperational algebra has interesting applications to rings of polynomials
as well as non-polynomials 8 but does not apply to the realm of
all functions, even though the three operations can be defined for any two
functions. The only ring postulate that is not generally satisfied is that,
for each $g$, there exist an $f$ such that $f + g = \mathit{0}$. For instance, $-\log + \log$
is not $\mathit{0}$, but rather the restriction of $\mathit{0}$ consisting of all $(x, 0)$ for $x > 0$.
Only $f + \log \subset \mathit{0}$ has solutions (namely, any $f \subset -\log$). What narrowed
6 Cf. Menger [13], [14].
7 In keeping with the traditional attitude toward substitution, the laws (2) are
hardly ever mentioned even though they are as important in analysis as is the
multiplicative-additive distributive law.
8 Cf. especially Milgram [18] p. 65, Heller [4] and Nöbauer [19].
the scope of trioperational algebra in its original form was the fact that it
did not take the relation $\subset$ into account.
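The failure of the additive-inverse postulate is easy to reproduce. The sketch below is an illustration that samples the constant function zero and the logarithm on a handful of integers (the sample points and names are hypothetical); the sum of log and its negative covers only the positive sample points.

```python
import math

def add(f, g):
    """Sum: pairs (x, p + q) with (x, p) in f and (x, q) in g."""
    d = dict(g)
    return {(x, y + d[x]) for x, y in f if x in d}

xs = [-2, -1, 0, 1, 2, 3]
zero = {(x, 0.0) for x in xs}                      # the constant function zero
log = {(x, math.log(x)) for x in xs if x > 0}      # defined for x > 0 only
neg_log = {(x, -y) for x, y in log}

total = add(neg_log, log)
```

`total` is a proper restriction of `zero`, not `zero` itself, so within the sampled universe no function added to log yields the ring's zero element.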
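A more satisfactory postulational approach to functions may be based
on the following idea of a hypergroup: a set $\mathcal{G}$ satisfying six postulates:
I. $\mathcal{G}$ is partially ordered by a relation $\subset$. For some purposes it is convenient
to assume that $\mathcal{G}$ includes a (necessarily unique) minimal element
$\emptyset$, such that $\emptyset \subset \gamma$ for each $\gamma$; or, even further, that $\mathcal{G}$ is atomized in the
sense that (1) for each $\gamma \neq \emptyset$, at least one $\alpha \subset \gamma$ is an atom (i.e., such that
$\alpha' \subset \alpha$ if and only if $\alpha' = \emptyset$); (2) $\gamma_1 \subset \gamma_2$ if and only if each atom $\subset \gamma_1$ is also
$\subset \gamma_2$. For other purposes, $\mathcal{G}$ may be assumed to be intersectional, i.e., to
include, for any two elements $\gamma_1$ and $\gamma_2$, a maximal element $\subset \gamma_1$ and
$\subset \gamma_2$, an intersection, $\gamma_1 \cap \gamma_2$.
II. In $\mathcal{G}$, there is an associative operation, $\circ$.
III. $\mathcal{G}$ includes a bilaterally and absolutely neutral element, $\nu$, such that
$\gamma \circ \nu = \gamma = \nu \circ \gamma$ for any $\gamma$.
Clearly, $\nu$ is unique. The connection between $\subset$, $\circ$, and $\nu$ is established
in the following postulate that simplifies the author's original development
and is due to Prof. A. Sklar.
IV. $\gamma \subset \delta$ if there exists an element $\nu' \subset \nu$ such that $\nu' \circ \delta = \gamma$ and if and
only if there exists an element $\nu'' \subset \nu$ such that $\gamma = \delta \circ \nu''$.
It readily follows that $\circ$ is bilaterally monotonic; that is to say, $\gamma_1 \subset \gamma_2$
implies $\gamma_1 \circ \gamma \subset \gamma_2 \circ \gamma$ and $\gamma \circ \gamma_1 \subset \gamma \circ \gamma_2$ for any $\gamma_1$, $\gamma_2$, $\gamma$. If there is a
minimal element, then $\emptyset$ may be a bilateral strict annihilator:
$\emptyset \circ \gamma = \emptyset = \gamma \circ \emptyset$ for each $\gamma$.
Moreover, if $\nu_1 \subset \nu$ and $\nu_2 \subset \nu$, then $\nu_1 \circ \nu_2 \subset \nu_1$ and $\subset \nu_2$; thus, if $\mathcal{G}$ is
intersectional, $\nu_1 \circ \nu_2 \subset \nu_1 \cap \nu_2$.
V. For each $\gamma$, there exist two unilaterally and relatively neutral elements,
$L\gamma$ and $R\gamma$ (the left-neutral and the right-neutral of $\gamma$) such that:
1) $L\gamma \circ \gamma = \gamma = \gamma \circ R\gamma$;
2) $L(\gamma_1 \circ \gamma_2) \subset L\gamma_1$ and $R(\gamma_1 \circ \gamma_2) \subset R\gamma_2$ for each $\gamma_1$ and $\gamma_2$;
3) if $\mu \subset \nu$, then $L\mu \subset \mu$ and $R\mu \subset \mu$.
Clearly, $L\gamma \subset \nu$ and $R\gamma \subset \nu$ for each $\gamma$. If $L\gamma = \gamma$ and/or $R\gamma = \gamma$, then
$\gamma \subset \nu$. If $\gamma \subset \nu$, then $L\gamma = \gamma = R\gamma$. Hence $LL\gamma = RL\gamma = L\gamma$ for every $\gamma$.
Moreover, $L\gamma = \emptyset$, $R\gamma = \emptyset$, and $\gamma = \emptyset$ are equivalent. If $\gamma \subset \delta$, then
$L\gamma \subset L\delta$ and $R\gamma \subset R\delta$. If $\kappa$ is an annihilator in the sense that $\gamma \circ \kappa = L\kappa$
and $\kappa \circ \gamma = R\kappa$ for each $\gamma$, and if $\kappa \subset \nu$, then $\kappa = \emptyset$. Moreover, $L\gamma$ and $R\gamma$
are characterized among the elements $\subset \nu$ by the following minimum
property: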
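If $\mu \subset \nu$, then $\mu \circ \gamma = \gamma$ implies $L\gamma \subset \mu$, and $\gamma \circ \mu = \gamma$ implies $R\gamma \subset \mu$.
It follows that $L\gamma$ and $R\gamma$ are unique for each $\gamma$. If $\circ$ is commutative, then
$L\gamma = R\gamma$ for each $\gamma$.
It will suffice, here, bypassing unilaterally opposite elements, to
postulate finally
VI. For each $\gamma$, there is a bilaterally opposite element Op $\gamma$ such that
Op $\gamma \circ \gamma \subset R\gamma$ and $\gamma \circ$ Op $\gamma \subset L\gamma$ for each $\gamma$,
and which, if one sets Op $\gamma \circ \gamma = R'\gamma$ and $\gamma \circ$ Op $\gamma = L'\gamma$, has the following
minimax property:
1) if $\delta \circ \gamma \subset R\gamma$ and $\gamma \circ \delta \subset L\gamma$, then $\delta \circ \gamma \subset R'\gamma$ and $\gamma \circ \delta \subset L'\gamma$;
2) if $\delta \circ \gamma = R'\gamma$ and $\gamma \circ \delta = L'\gamma$, then Op $\gamma \subset \delta$.
Op $\gamma$ is unique for each $\gamma$, and $L'\gamma \circ \gamma = \gamma \circ R'\gamma$, which might be called
$C\gamma$, the core of $\gamma$. If $\mu \subset \nu$, then Op $\mu = R'\mu = L'\mu = C\mu = \mu$. For each
atom, Op $\alpha$ is an atom, and $C\alpha = \alpha$. Additional assumptions would
guarantee that
Op $\delta \circ$ Op $\gamma \subset$ Op$(\gamma \circ \delta)$; Op Op $\gamma \subset \gamma$; Op Op Op $\gamma$ = Op $\gamma$ for each
$\gamma$, $\delta$. However, $\gamma \subset \delta$ does not imply Op $\gamma \subset$ Op $\delta$.
An element $\gamma$ of a hypergroup will be called right-elementary (or left-elementary)
if $\delta \subset \gamma$ implies $R\delta \subset R\gamma$ (or $L\delta \subset L\gamma$). Each atom is bilaterally
elementary, or briefly, elementary. If $\mathcal{G}$ is commutative and $\gamma$ is
unilaterally elementary, $\gamma$ is elementary. If $\mathcal{G}$ is atomized, then $\rho$ is
right-elementary if and only if $\rho \circ \mu = \rho$ implies $\mu \subset \nu$. If, in contrast,
$\kappa \circ \gamma \subset \kappa$ for each $\gamma$, then $\kappa$ may be called a left-annihilator; and each $\kappa'$
that is $\subset \kappa$, a left-quasi-annihilator. Clearly, $\kappa' \circ \gamma$ and $\gamma \circ \kappa'$ are
left-quasi-annihilators for any $\gamma$. If each element of $\mathcal{G}$ is right-elementary
(or elementary), then $\mathcal{G}$ will be said to be right-elementary 9 (or
elementary).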
With regard to addition as well as multiplication, the set of all functions
is a commutative elementary hypergroup. The universal neutrals are $\mathit{0}$
and $\mathit{1}$; the relative neutrals of $f$ are, as it were, vertical projections of $f$ on
$\mathit{0}$ and $\mathit{1}$, respectively; the opposites of $f$ are $-f$ and $1/f$.
With regard to substitution, the set of all functions is a (non-commutative)
right-elementary hypergroup. The universal neutral is $j$. The
relative neutrals, $Rf$ and $Lf$, correspond to dom $f$ and ran $f$, respectively. 10
Thus the contrast between functions (classes of pairs of numbers) and
their domains and ranges (classes of numbers) disappears. Op $f$ is Inv $f$.
The left annihilators $\neq \emptyset$ are what may be called universal constant
functions; the left-quasiannihilators $\neq \emptyset$ are the constant functions 11.
9 Prof. B. Schweizer proposes to call $\delta$ a right-neutralizer of $\gamma$ if $\gamma \circ \delta \subset \nu$ and
$L\delta \subset R\gamma$, and to say that $\delta$ is (1) maximal if $\gamma \circ \delta' \subset \gamma \circ \delta$ for each right-neutralizer $\delta'$,
(2) saturating if $\gamma \circ \delta = L\gamma$. One might then postulate that each element $\gamma$, on either
side, has at least one maximal neutralizer or at least one saturating neutralizer.
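The hypergroup structure under substitution can be spot-checked on finite classes of pairs. The sketch below is an illustration, not a proof: it takes the relative neutrals to be the identity restricted to the domain and to the range, takes Op to be the finite inverse (a reversed pair is kept when its value is attained exactly once), and checks postulates V and VI on two hypothetical sample functions.

```python
from collections import Counter

def substitute(f1, f2):
    """f1 f2: all pairs (x, z) with (x, y) in f2 and (y, z) in f1 for some y."""
    d1 = dict(f1)
    return {(x, d1[y]) for x, y in f2 if y in d1}

def inv(f):
    """Reversed pairs (y, x) of f, kept only when y is taken at exactly one x."""
    hits = Counter(y for _, y in f)
    return {(y, x) for x, y in f if hits[y] == 1}

def R(f):
    """Right-neutral: identity restricted to dom f."""
    return {(x, x) for x, _ in f}

def L(f):
    """Left-neutral: identity restricted to ran f."""
    return {(y, y) for _, y in f}

f = {(x, x * x) for x in range(-2, 4)}
g = {(x, x - 1) for x in range(0, 6)}

neutral_ok = substitute(L(f), f) == f == substitute(f, R(f))      # V, 1)
fg = substitute(f, g)
monotone_ok = L(fg) <= L(f) and R(fg) <= R(g)                     # V, 2)
op = inv(f)
opposite_ok = substitute(op, f) <= R(f) and substitute(f, op) <= L(f)   # VI
```

For these samples `op` consists of the pairs (0, 0) and (9, 3), and all three checks succeed.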
Another example of a hypergroup is the set of all binary relations in
some universal set with regard to what logicians call the relative pro-
duct 12. The universal neutral is the identity relation, while the relative
neutrals again correspond to domains and ranges. Op $\gamma$ is a restriction
(in general, a proper restriction) of the converse of the relation $\gamma$.
Geometrically, the situation may be interpreted in a set (a "plane"
consisting of "points") that is decomposed into mutually disjoint subsets
("vertical lines"). "Simple" sets, i.e., sets having at most one point in
common with each vertical line, are the counterpart of functions. This
vertical simplicity corresponds to right-side elementariness of functions.
Substitution can be illustrated if, secondly, the plane is decomposed into
disjoint subsets ("horizontal lines") each of which has exactly one point in
common with each vertical line; and if, thirdly, there is given a "diagonal"
set having exactly one point in common with each vertical line as well as
with each horizontal line. The diagonal corresponds to $j$; each horizontal
line, to a universal constant function; the vertical (the horizontal)
projection of a simple set $f$ on the diagonal, to $Rf$ (to $Lf$); the points, to
atoms. Fig. 1, p. 460, based on the assumption of ordinary vertical and
horizontal lines and a straight diagonal, $j$, shows a simple plane construction 13
of the result of substituting $g$ into $f$. For any point $\alpha$ in the set
$g$, move horizontally to $j$, then vertically to $f$, and finally horizontally
back to the vertical line through $\alpha$. The set of all points thus obtained is
10 In contrast to groups and hypergroups, a Brandt groupoid (Mathematische
Annalcn, vol. 96) only permits the composition of some elements. In ,, categories"
(i.e., essentially, groupoids) of mappings of groups on groups, MacLane calls the
one-side identities of a mapping its domain and range.
11 In a self-explanatory way, one can say that functions constitute a commu-
tative elementary hyperfield with regard to addition and multiplication (with the
multiplicative annihilator 0) and (non-commutative) right-elementary hyperfields
with regard to addition and substitution as well as multiplication and substitution.
The functions may also be said to constitute a trioperational hyperalgebra.
12 Cf., e.g., McKinsey [8] and Tarski [22].
13 Cf. Calculus pp. 89 ff . The traditional postulational theory of binary relations
is inapplicable to functions (cf. 3). On the other hand, the plane construction of
functional substitution here described may, as Prof. M. A. McKiernan observed, be
utilized for binary relations instead of the 3-dimensional construction proposed by
Tarski [22] pp. 78, 79.
460
KARL MENGER
/g. In the figure, / has the shape of an exponential curve ; g, that of — /2 ;
hence /g, that of the probability curve.
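The behaviour of substitution and of the relative neutrals can be tried out on finite samples. The following Python sketch rests on assumptions that are not part of the text: a 1-place function is modelled as a dict (a finite class of consistent pairs), and the helper names substitute, right_neutral and left_neutral are hypothetical.

```python
# A 1-place function is modelled as a finite dict {x: f(x)},
# i.e. a class of consistent pairs (an assumption made for illustration).

def substitute(f, g):
    """Return fg, the result of substituting g into f: all pairs (x, f(g(x)))."""
    return {x: f[y] for x, y in g.items() if y in f}

def right_neutral(f):
    """Rf: the restriction of the identity j to dom f."""
    return {x: x for x in f}

def left_neutral(f):
    """Lf: the restriction of the identity j to ran f."""
    return {y: y for y in f.values()}

g = {x: -x * x for x in range(-2, 3)}   # samples of -j^2
f = {y: 2 ** y for y in range(-4, 1)}   # samples of an exponential function
fg = substitute(f, g)                   # samples of a bell-shaped curve
```

On these samples fg is largest at 0 and falls off symmetrically, which matches the bell shape mentioned for the figure; the two neutrals leave f unchanged when substituted on the appropriate side.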
Fig. 1
Notwithstanding the analogy (brought out in the concept of a hypergroup)
of addition and multiplication with substitution, the latter has a
definite primacy. In an atomized non-commutative hypergroup, any
binary operation $\times$ (such as + and .), defined in the class of all atoms
$\subset v$, may be extended to any two elements y' and y'' of the hypergroup by
defining $y' \times y''$ as the minimum element including all atoms a such that
there exist two atoms $a' \subset y'$ and $a'' \subset y''$ satisfying the following
conditions:
$Ra = Ra' = Ra''$ and $La = La' \times La''$.
(This is essentially how the arithmetical operations are extended from
numbers to functions.) Moreover, Neg f and Rec f, the negative and the
reciprocal of f, are obtainable by substituting f into $-j$ and $j^{-1}$,
respectively; and, as will be shown presently, even $f + g$, $f \cdot g$, and
$f_1/f_2$ can be obtained from 2-place functions S, P, and Q by substitution.
In contrast, there are no functions of any kind which, for each f and g,
would yield fg or even Inv f by additions and multiplications. 14 Beyond any
question, in the realm of functions substitution is the operation par excellence.
14 In fact, no 2-place function yields fg even by substitution of f and g. Cf.
Calculus, p. 304.
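The extension of a binary operation from numbers to functions, and the remark that Neg f arises by substituting f into $-j$, can be illustrated as follows. This is a minimal sketch under the assumption that functions are finite dicts; the names extend and substitute are hypothetical.

```python
import operator

def extend(op, f, g):
    """Extend a binary operation on numbers to two functions (dicts):
    the result pairs each common first member with op of the two values.
    Functions with disjoint domains yield the empty function {}."""
    return {x: op(f[x], g[x]) for x in f.keys() & g.keys()}

def substitute(f, g):
    """fg: all pairs (x, f(g(x))) for which g(x) lies in dom f."""
    return {x: f[y] for x, y in g.items() if y in f}

f = {1: 2, 2: 4, 3: 8}
g = {2: 10, 3: 20, 4: 30}

total = extend(operator.add, f, g)        # f + g on the common domain {2, 3}
neg_j = {y: -y for y in range(-10, 11)}   # samples of the function -j
neg_f = substitute(neg_j, f)              # Neg f obtained by substitution
```

The sketch only covers the pointwise case; it does not attempt to model atoms or hypergroups in general.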
For any integer $m \geq 1$, a class of consistent pairs whose second members
are real numbers while their first members are sequences of m real
numbers is called, briefly, an m-place function 15 and will be designated by
a capital italic, except where lower case italics emphasize the 1-place
character of functions such as those treated in what precedes. The class Q
of the pairs $((x_1, x_2), x_1/x_2)$ for all $x_1$ and $x_2 \neq 0$ is a 2-place
function. So are sum and product, S and P, from which, because of their
associativity, an m-place sum and product, $S^m$ and $P^m$, can be derived for
each $m > 2$. Of particular importance is, for any two integers $1 \leq i \leq k$, the
i-th k-place selector function $I_i^k$, that is, the class of all pairs
$((x_1, \ldots, x_k), x_i)$ for any $x_1, \ldots, x_k$. Clearly, $I_1^1 = j$.
There are two main types of substitution of sequences of m functions
into an m-place function (which, for m = 1, coincide with one another
and with the substitution as defined on p. 455):
a) product substitution: $F^m[G_1, \ldots, G_m]$, whose domain is a subset of
the Cartesian product, $\operatorname{dom} G_1 \times \ldots \times \operatorname{dom} G_m$, which is the
class of all sequences $(\gamma_1, \ldots, \gamma_m)$ for any $\gamma_1 \in \operatorname{dom} G_1, \ldots,
\gamma_m \in \operatorname{dom} G_m$.
b) intersection substitution: $F^m(G_1, \ldots, G_m)$, whose domain is
$\subset \operatorname{dom} G_1 \cap \ldots \cap \operatorname{dom} G_m$. Unless $G_1, \ldots, G_m$ have the same
place-number, that intersection is empty and $F^m(G_1, \ldots, G_m) = \emptyset$; e.g.,
$P(j, S) = \emptyset$, while $P(I_1^2, S)$ and $P(I_2^2, S)$ are non-empty. Clearly,
$Q(f_1, f_2) = f_1/f_2$, as defined on p. 455; and $I_i^m(G_1, \ldots, G_m) = G_i$ for any
m functions of the same place-number. A simple generalization of the plane
construction described on p. 459 to the 3-dimensional space 16 yields
$F^2(G_1^2, G_2^2)$.
Traditionally, $P(j^2, \log)$, $P[j^2, \log]$, $P(P, S)$, $P[P, S]$, $P(I_2^2, S)$, $P[j^2, S]$
are referred to as the functions $x^2 \log x$, $x^2 \log y$, $xy(x + y)$, $xy(u + v)$,
$y(x + y)$, $x^2(y + z)$, respectively.
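The difference between the two types of substitution can be made concrete with ordinary Python callables. The sketch below is an illustration only: it assumes functions are given as callables (so domains are not tracked, and a case such as $P(j, S) = \emptyset$ is not modelled), and the helpers intersection_sub and product_sub, together with the explicit place-numbers passed to product_sub, are hypothetical.

```python
import math

def S(x, y): return x + y          # 2-place sum
def P(x, y): return x * y          # 2-place product
def j2(x): return x * x            # the 1-place function j^2
def I22(x, y): return y            # the selector I_2^2

def intersection_sub(F, *Gs):
    """F(G1, ..., Gm): every Gi is applied to the same argument sequence."""
    return lambda *xs: F(*(G(*xs) for G in Gs))

def product_sub(F, *Gs_with_places):
    """F[G1, ..., Gm]: each Gi consumes its own block of arguments.
    Each entry is a pair (Gi, place-number of Gi)."""
    def H(*xs):
        values, pos = [], 0
        for G, places in Gs_with_places:
            values.append(G(*xs[pos:pos + places]))
            pos += places
        return F(*values)
    return H

x2_log_x = intersection_sub(P, j2, math.log)        # P(j^2, log)
x2_log_y = product_sub(P, (j2, 1), (math.log, 1))    # P[j^2, log]
xy_x_plus_y = intersection_sub(P, P, S)              # P(P, S)
xy_u_plus_v = product_sub(P, (P, 2), (S, 2))         # P[P, S]
y_x_plus_y = intersection_sub(P, I22, S)             # P(I_2^2, S)
x2_y_plus_z = product_sub(P, (j2, 1), (S, 2))        # P[j^2, S]
```

The six results take 1, 2, 2, 4, 2 and 3 arguments respectively, in agreement with the traditional expressions listed above.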
Either type of substitution can be extended to a realm of sequences
of functions. With each sequence, besides the number of functions
in it, called the sequence-number, a place-number will be associated.
Either substitution of a second sequence into a first presupposes that
the sequence-number of the second be equal to the place-number of the first.
15 One might introduce numbers as 0-place functions.
16 Cf. Menger [15], p. 224. Recently, S. Penner in his Master's thesis at Illinois
Institute of Technology has extended the geometric axiomatics of substitution,
outlined on p. 459 of the present paper, from 1-place to m-place functions in the
(m + 1)-dimensional space.
a) An s-place array of r functions is a sequence such that the sum of
the r place-numbers is s; for instance $\Phi_r^s = [F_1^{m_1}, \ldots, F_r^{m_r}]$, where
$s = m_1 + \ldots + m_r$. Product substitution, defined by
$\Phi_r^s \Gamma_s^t = [F_1^{m_1}[G_1, \ldots, G_{m_1}], \ldots, F_r^{m_r}[G_{s-m_r+1}, \ldots, G_s]]$,
clearly is associative and admits unilateral neutrals:
$\Phi_r^s[\Gamma_s^t X_t^u] = [\Phi_r^s \Gamma_s^t] X_t^u$ and $J_r^r \Phi_r^s = \Phi_r^s = \Phi_r^s J_s^s$,
where, for any $k \geq 1$, $J_k^k = [j]^k$, an array of k functions j.
b) An s-place throw of r functions is a sequence such that all r place-numbers
are s; for instance, $\mathbf{F}_r^s = (F_1^s, \ldots, F_r^s)$. Intersection
substitution, defined by
$\mathbf{F}_r^s \mathbf{G}_s^t = (F_1^s(G_1^t, \ldots, G_s^t), \ldots, F_r^s(G_1^t, \ldots, G_s^t))$,
is associative and admits unilateral neutrals. Let $\mathbf{I}_k^k$ be the k-place throw
of all k-place selector functions in the natural order, and let $(\mathbf{I}_k^k)^h$ denote
the k-place throw of hk functions forming a chain of h throws $\mathbf{I}_k^k$. Then
$\mathbf{F}_r^s(\mathbf{G}_s^t \mathbf{H}_t^u) = (\mathbf{F}_r^s \mathbf{G}_s^t)\mathbf{H}_t^u$ and $\mathbf{I}_r^r \mathbf{F}_r^s = \mathbf{F}_r^s = \mathbf{F}_r^s \mathbf{I}_s^s$.
By means of intersection substitution, the array $\Phi_r^{rs}$ and the throw $\mathbf{F}_r^s$
with the same components $F_1^s, \ldots, F_r^s$ can be connected: $\mathbf{F}_r^s = \Phi_r^{rs}(\mathbf{I}_s^s)^r$.
Commutativity and associativity of addition and the distributive law
can be expressed in the formulae:
$S = S(I_2^2, I_1^2)$; $S[j, S] = S[S, j]$; $P[S, j] = S[P, P](I_1^3, I_3^3, I_2^3, I_3^3)$.
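These three formulae can be checked numerically on a small grid of integers. The sketch below assumes, for illustration only, that functions are Python callables; the helpers I, isub (intersection substitution) and psub (product substitution, with place-numbers passed explicitly) are hypothetical names.

```python
from itertools import product

def S(x, y): return x + y
def P(x, y): return x * y
def j(x): return x

def I(i, k):
    """Selector I_i^k: returns the i-th of k arguments (1-based)."""
    def select(*xs):
        assert len(xs) == k
        return xs[i - 1]
    return select

def isub(F, *Gs):
    """Intersection substitution F(G1, ..., Gm)."""
    return lambda *xs: F(*(G(*xs) for G in Gs))

def psub(F, *Gs_with_places):
    """Product substitution F[G1, ..., Gm]; entries are (Gi, place-number)."""
    def H(*xs):
        values, pos = [], 0
        for G, places in Gs_with_places:
            values.append(G(*xs[pos:pos + places]))
            pos += places
        return F(*values)
    return H

grid = range(-3, 4)
commutative = all(S(x, y) == isub(S, I(2, 2), I(1, 2))(x, y)
                  for x, y in product(grid, repeat=2))
associative = all(psub(S, (j, 1), (S, 2))(*t) == psub(S, (S, 2), (j, 1))(*t)
                  for t in product(grid, repeat=3))
lhs = psub(P, (S, 2), (j, 1))                       # (x + y) z
rhs = isub(psub(S, (P, 2), (P, 2)),                 # 4-place ab + cd ...
           I(1, 3), I(3, 3), I(2, 3), I(3, 3))      # ... evaluated at (x, z, y, z)
distributive = all(lhs(*t) == rhs(*t) for t in product(grid, repeat=3))
```

A finite check of this kind is of course no proof; it merely confirms that the selector pattern $(I_1^3, I_3^3, I_2^3, I_3^3)$ reproduces $xz + yz$.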
The existence of right-neutrals has the following simple
Corollary. Every non-empty function of any number of places lends
itself to substitutions (of both types) with non-empty results.
For any k > 1, the k-place throws of k functions form a hypergroup
by intersection substitution. More generally, throws as well as arrays of
functions constitute what might be called hypergroupoids — a concept
that will be studied elsewhere.
Both types of substitution can be extended to n-ary relations. For
instance, if P is a class of sequences of n + 1 elements; and if $\Pi_1, \ldots, \Pi_n$
are classes of (not necessarily consistent) ordered pairs, then
$P[\Pi_1, \ldots, \Pi_n]$ will denote the class of all sequences $(\alpha_1, \ldots, \alpha_n, \gamma)$
such that for some $\beta_1, \ldots, \beta_n$:
$(\alpha_1, \beta_1) \in \Pi_1, \ldots, (\alpha_n, \beta_n) \in \Pi_n$ and $(\beta_1, \ldots, \beta_n, \gamma) \in P$.
In what precedes, only real functions have been considered, but all statements
(including the following remarks) remain valid if one selects a ring (or, where
division is involved, a field) and writes element of the ring (the field) instead of
real number.
The definitions of arithmetical operations for functions (addition, etc.)
merely presuppose classes of consistent pairs whose second members are
real numbers. The nature of the first members plays no role. Operating on
functions with disjoint domains, however, yields $\emptyset$; for instance,
$I_1^2 + \log = \emptyset$ and $j \cdot S = \emptyset$. Hence, for some results in a class of
functions to be non-empty, it is necessary that some domains be non-disjoint 17.
With this proviso, the arithmetical operations may be extended to what I will
call functors: classes of consistent quantities, if quantity is any ordered pair
whose second member is a number 18. Of course only functors whose
domains consist of mathematical entities are objects of pure mathematics.
Mathematical functors that are not functions have been called
functionals; e.g., the class $\int_0^1$ of all pairs $(f, \int_0^1 f)$ for any integrable
function f.
Substitution presents an altogether different situation. If the result
$f_1 f_2$ is non-empty it is so because the first member of the pair $(y, z) \in f_1$ is
the second member in a pair $(x, y) \in f_2$; in other words, because functions
are classes of pairs whose first and second members are of like nature 19. A
similar reason accounts for substitutions of sequences of functions into
functions of several places. In view of the corollary on p. 462, the only
functors that lend themselves to substitution with some non-empty results are
the functions. Calling every class of consistent quantities a "function"
(which has been proposed) thus epitomizes overemphasis on addition and
multiplication as well as supreme disregard for the paramount operation
in the realm of functions: substitution.
II. FLUENTS
The objects of science and geometry to which Newton referred as fluents
and which he and his successors have treated with supreme virtuosity
have not, in the classical literature, ever been defined either explicitly or,
by postulates, implicitly. There are of course scientific procedures
determining, for instance, $p\gamma_0$, the gas pressure in atm. of a specific
instantaneous gas sample $\gamma_0$, corresponding to arithmetical definitions of
log 2. But the function log (even though its definition on p. 454 presupposes
the understanding of log x for any x) must be distinguished from the
numbers log x as well as from the class ran log. Similarly, p (in the
sequel, fluents as well as 1-place functions will be designated by lower
case italics) must be distinguished from the numbers $p\gamma$ as well as from
ran p (the class of all those numbers). The fluent p is the class of all pairs
$(\gamma, p\gamma)$ for any instantaneous gas sample $\gamma$.
17 Functions of the same place-number, and even throws, satisfy this condition,
and actually lend themselves to meaningful addition and multiplication.
18 Cf. Calculus, Chapter VII.
19 What that common nature of the elements is plays no role in the definition of
substitution. For any set S, one may consider classes of consistent pairs of elements
of S (self-mappings of S) and define substitution. Examples include n-place throws
of n functions.
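The distinction between a fluent, its values, and its range can be mirrored in a few lines of Python. Everything in this sketch is hypothetical illustration: the GasSample class, the labels, and the pressure readings are invented, and a fluent is modelled as a dict whose keys are extramathematical objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GasSample:
    label: str  # hypothetical identifier of an instantaneous gas sample

g0, g1, g2 = GasSample("gamma_0"), GasSample("gamma_1"), GasSample("gamma_2")

# The fluent p: the class of all pairs (sample, pressure in atm).
p = {g0: 1.5, g1: 2.0, g2: 1.5}

value = p[g0]              # a number, the pressure of one specific sample
ran_p = set(p.values())    # a class of numbers

# A different fluent q with exactly the same range as p.
q = {g0: 2.0, g1: 1.5, g2: 1.5}
```

Since p and q differ while their ranges coincide, the class of values alone cannot stand in for the fluent.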
Besides this (as it were, objective) pressure p, there is, for any
observer A, a fluent $p_A$, the gas pressure in atm. observed by A, which is the
class of all pairs $(\alpha, p_A\alpha)$ for any act $\alpha$ of A's reading a pressure gauge
calibrated in atm., where $p_A\alpha$ denotes the number (the pure number, say,
1.5) read by A as the result of $\alpha$.
Thus extramathematical features (such as "denomination" and "dimension")
that are often attributed to the values of p and $p_A$ are, as it were, absorbed
in the definitions of these fluents. Their values being pure numbers, also
ran p and ran $p_A$ are objects of pure mathematics. In contrast, dom p and
dom $p_A$ and, therefore, p and $p_A$ themselves are extramathematical objects. The
definition of an entire fluent adds to the knowledge of its values the idea of a
class, a class that is highly significant in some physical laws and, in fact,
indispensable if intuitive understanding (however efficient) of those laws is to
crystallize in articulate formulations.
Differentiation between p, on the one hand, and the numbers $p\gamma$ or the
class ran p, on the other, however slight the difference may appear, is at
variance with the entire traditional literature on fluents inasmuch as the
latter is at all articulate. McKinsey, Sugar, and Suppes 20 introduce time
as a class of numbers (clock readings) and Artin 21 takes a similar position
(whereas, from the point of view here expounded, $t_A$, for an observer A,
is the class of all pairs $(\tau, t_A\tau)$ for any act $\tau$ of clock reading performed by
A). Courant says explicitly 22 that Boyle's law deals only with the values
of p and v and not with those quantities themselves. All that physics
supplies, he emphasizes, are the classes of values of p and v.
In fact, Courant mentions p as an example of a variable (a symbol that
may be replaced with the designation of any element of a class of
numbers), thereby illustrating another error pervading the traditional
literature: the identification of fluents with what herein is called number
variables, and the indiscriminate use of the term variable as well as the
same (italic) type for both.
20 Cf. McKinsey, Sugar and Suppes [9].
21 Cf. Artin [1], p. 70.
22 Cf. Courant [3], p. 16.
Yet (and this is a mere hint of the actual gulf separating the two) number
variables may be interchanged, whereas fluents (e.g., abscissa and ordinate
along a curve in a Cartesian plane, x being the class of all pairs $(\pi, x\pi)$ for any
point $\pi$ on the curve) must not. For instance, the class of all (x, y) such that
$y = x^2$ is the same as the class of all (y, x) such that $x = y^2$, whereas the
parabola $y = x^2$ and the parabola $x = y^2$ are different curves.
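The contrast can be checked directly. In the sketch below, which uses only a finite grid of integers and hypothetical names (abscissa, ordinate, curve), the letters in the set comprehensions are number variables and may be renamed freely, whereas abscissa and ordinate are fluents defined on the points of a particular curve.

```python
N = range(-3, 4)

# Number variables: renaming x and y does not change the class of pairs.
A = {(x, y) for x in N for y in range(0, 10) if y == x * x}
B = {(y, x) for y in N for x in range(0, 10) if x == y * y}
same_class = (A == B)

# Fluents: abscissa and ordinate on the points of the parabola y = x^2.
curve = [(t, t * t) for t in N]
abscissa = {pt: pt[0] for pt in curve}
ordinate = {pt: pt[1] for pt in curve}

# "ordinate = abscissa squared" holds on this curve ...
law_1 = all(ordinate[pt] == abscissa[pt] ** 2 for pt in curve)
# ... but interchanging the two fluents gives a statement that fails on it.
law_2 = all(abscissa[pt] == ordinate[pt] ** 2 for pt in curve)
```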
The confusion is enhanced by the use of the term variable, thirdly, for symbols
that are replaceable with the designations of any element of some well-defined
class of fluents or of classes of consistent quantities; in other words, for fluent
variables or c.c.q. variables; e.g., for u in the statement $\frac{d \sin u}{du} = \cos u$ for any
c.c.q. u that is continuous on (the limit class) dom u. Here, u may be replaced
with the designation of the time 23 or the abscissa or even a continuous
functional (as is $\int_0^1$ in the realm of continuous functions whose limit is defined
by uniform convergence), but not with the designation of a number. One has
$\frac{d \sin t}{dt} = \cos t$, $\frac{d \sin x}{dx} = \cos x$ and even $\frac{d \sin \int_0^1}{d \int_0^1} = \cos \int_0^1$,
whereas $\frac{d \sin 1}{d1} = \cos 1$ is nonsense.
The literature also contains allusions to fluents that avoid confusing
them with either classes of numbers or number variables. But those
allusions (usually to "variable numbers") are inarticulate beyond
recognition. For instance Russell 24, Tarski 25, and other logicians in
discussing number variables have repeatedly criticized the misconception
of numbers that are capable of various values; and indeed, there are no
numbers that are both 0 and 1, nor, as some one put it, numbers that have
different values on weekdays and on Sundays. What logicians seem to
overlook, however, is the fact that many obscure allusions to "variable
numbers" do not refer to number variables in the logico-mathematical
sense, but rather represent utterly confused references to Newton's
fluents. A fluent (without of course being a variable number) may indeed
assume both the value 1 and the value 0. In fact, it may (as does, e.g., the
admission fee in $ to certain art galleries) assume the value 1 on weekdays
and the value 0 on Sundays.
23 Strictly speaking, the domain of a fluent is not a limit class. In a model,
however, according to the concluding remarks of the present paper, t and s may be
assumed to be continuous classes of consistent quantities on domains that are limit
classes. Cf. Calculus, pp. 220-225.
24 Cf. Russell [20], p. 90.
25 Cf. Tarski [21], pp. 3, 4.
In the broadest sense, a fluent may be defined as a class of consistent
quantities with an extramathematical domain, the c.'s c.q. with
mathematical domains being functions and functionals. Fluents such as the
class h of all pairs (F, hF) for any Frenchman F, where hF is F's height in
cm. (studied in biology and sociology), are sometimes called variates;
their domains, populations.
Clearly, not every quantity, as defined on p. 463, is interesting; nor is every
fluent significant, even if its elements are interesting quantities — think of the
union of the height in the population of France and the weight in the population
of Italy. Nor, for that matter, is every function and every functional important.
While the general theory, of course, provides the scheme for handling all
fluents, it is up to the individual investigator to apply it to some of the countless
cases that are theoretically or practically significant.
Some critics of the theory here expounded have suggested that its basic idea,
the concept of fluent, has always been known, viz., under the name of "real
function" and, moreover, follows the pattern of Kolmogoroff's well-established
concept of random variables (r.v.'s). Besides overextending the use of the
term function (see p. 463), those critics seem to overlook: (1) that what is
essential in the theory is the formulation of definitions for the (heretofore
only intuitively used) concepts that Newton called fluents, definitions that
are at variance with their traditional treatment, which ignores classes of pairs
altogether (see p. 464); (2) that scientific fluents and r.v.'s lack one another's
very characteristics and are, if anything, complementary rather than parallel
concepts. 26 If A is a physical die, then the (extramathematical) class $t_A$ of all
pairs $(\delta, t_A\delta)$ for any act $\delta$ of rolling A is an experimental fluent but not a
r.v., not even if an additive functional ("probability") is defined for the $2^6$
subsets of ran $t_A$ = {1, ..., 6} (i.e., the class S of all possible outcomes of
rolling A). On the other hand, in presence of such a probability functional on the
subsets of S, any (purely mathematical) function having S as domain is a r.v. but
not a scientific fluent; e.g., the function f for which $f(1) = \sqrt{7}$, $f(2) = \pi + e$,
$f(6) = \cos 2 + \log 5$. By their definitions, r.v.'s lack connections with
experiment and observation. Again, scientific fluents such as $t_A$, gas pressure,
and time lack the characteristic of r.v.'s, since the definition of a reasonable
probability on subsets of their domains is completely out of the question.
(What should be the probability of an act of rolling A, or of a gas sample or of
an act of clock reading? Only in the range of a scientific fluent can one define
frequency, relative frequency and, perhaps, probability.) (3) That even someone
referring to all functors as "functions" cannot escape the use of a special
term (say, "functions in the strict sense") referring to those functors whose
domains consist of numbers or sequences of numbers. For (because of their
substitutive properties, not shared by any other functors) these functors play a
special role, and therefore are omnipresent in science as well as in mathematics.
While in the light of the conceptual clarifications, terminological questions are
quite insignificant, it does seem most appropriate to call fluents what Newton
called fluents, and functions, what Leibniz called functions.
26 Cf. Menger [17], pp. 222-223.
The union of non-identical fluents with the same domain is not a fluent.
From a set-theoretical point of view, fluents do not constitute a Boolean
algebra. One of the few positive formal properties of fluents is the
possibility of substituting them into 1-place functions: log p is the class of
all pairs $(\gamma, \log p\gamma)$ for any sample $\gamma$, a definition analogous to that of
log cos. But while also the cosine of the logarithm is a c.c.q. $\neq \emptyset$, the
pressure of the logarithm is empty. Every function permits some non-empty
substitutions, whereas a fluent (like a functional) permits none.
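The asymmetry between log p and "the pressure of the logarithm" shows up as soon as one tries to compute both. The following sketch is illustrative only: the sample labels and readings are invented, the fluent is a dict, and the two helper names are hypothetical.

```python
import math

# Hypothetical fluent p: pairs (gas sample, pressure in atm).
p = {"sample_1": 1.5, "sample_2": 2.0, "sample_3": 0.5}

def sub_fluent_into_function(f, in_domain, u):
    """f u: all pairs (a, f(u a)) for which the value u a lies in dom f."""
    return {a: f(value) for a, value in u.items() if in_domain(value)}

def sub_function_into_fluent(u, f, numbers):
    """u f: all pairs (x, u(f x)); this needs f x to be an element of dom u."""
    return {x: u[f(x)] for x in numbers if f(x) in u}

log_p = sub_fluent_into_function(math.log, lambda x: x > 0, p)
p_log = sub_function_into_fluent(p, math.log, [0.5, 1.0, 2.0, 10.0])
```

Here log_p has one pair per sample, while p_log is empty because no value of log is a gas sample.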
Attempts have been made to dodge the problem of articulately connecting
various fluents by defining some of them as functions of others 27. Yet, even
if in a gallery a sign declares that admission costs $ 1 on weekdays and is free
on Sundays, the concept of admission fee cannot very well be said to be
defined as a function of the time. Someone unfamiliar with that concept will not
grasp it by reading the sign while, on the other hand, the concept is
comprehensible to persons ignorant of the days of the week. Actually, admission fee
might (for operative purposes) be defined as the class a of all pairs $(\lambda, a\lambda)$ for
any act $\lambda$ of admitting a visitor, where $a\lambda$ is the amount in $ charged during $\lambda$.
The sign, comprehensible only to those who know a and t, stipulates how the
two are connected.
By substitutions into 2-place functions S, P, etc., significant addition,
multiplication etc. of fluents can be defined, provided that their domains
are non-disjoint, which is the only condition for arithmetical operations on
c.'s c.q. to be non-empty (see p. 463). For instance, P(p, v), the result of
intersection substitution of p and v (whose common domain is the class
of all $\gamma$) is $p \cdot v$. But a slight change in the point of view raises difficulties.
What, in view of the fact that dom $p_A$ and dom $v_A$ consist of acts of
different (manometric and volumetric) observations, is the meaning of
$p_A \cdot v_A$? Since Boyle, it has become traditional to associate with that
symbol (if only intuitively, i.e., without explicit definitions) the class
$((\pi, \beta), p_A\pi \cdot v_A\beta)$ for any two simultaneous acts $\pi$ and $\beta$ that A directs to
the same object; thus $p_A \cdot v_A$ denotes the restriction of $P[p_A, v_A]$ to the
class $\Gamma$ of all pairs of simultaneous and co-objective acts $\in$ dom $p_A \times$ dom $v_A$.
It thus appears that in operating on fluents, besides referring to the
elements of their several domains, one may well have to relativize the
operations to certain pairings of those domains. Such relativizations are
imperative in formulating (articulately formulating) relations between
fluents.
27 Cf. the references in footnotes 20 and 21.
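The restriction of the product substitution to simultaneous, co-objective acts can be sketched as follows. All data here are invented for illustration: acts are modelled as (kind, time, object) triples, and the names product_P and simultaneous_co_objective are hypothetical.

```python
# Observed fluents of an observer A on disjoint domains of acts.
p_A = {("manometer", 0, "tank"): 1.0, ("manometer", 1, "tank"): 2.0}
v_A = {("volumeter", 0, "tank"): 6.0, ("volumeter", 1, "tank"): 3.0}

def product_P(u, w):
    """P[u, w]: pairs ((a, b), u a * w b) over the whole Cartesian product."""
    return {(a, b): u[a] * w[b] for a in u for b in w}

def simultaneous_co_objective(a, b):
    """True if two acts happen at the same time and concern the same object."""
    return a[1] == b[1] and a[2] == b[2]

full = product_P(p_A, v_A)
Gamma = {(a, b) for (a, b) in full if simultaneous_co_objective(a, b)}
pv_A = {pair: full[pair] for pair in Gamma}   # the restriction of P[p_A, v_A] to Gamma
```

On the whole Cartesian product the products vary; on the pairing Gamma they are constant, which is the form a Boyle-type regularity takes for an observer.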
III. RELATIVE CONNECTIONS OF FLUENTS BY FUNCTIONS
Consider Boyle's law for gas undergoing an isothermal process: in
proper units, $v = 1/p$. If all that physics supplied were the values of v and
p or the classes of those values, then Boyle might have discovered his
law upon being presented with a bag containing cards each indicating a
value of p, and another bag informing him of the values of v. But why, in
that situation, should Boyle have paired each number in the first bag just
with its reciprocal in the second rather than, say, with its square root?
As a matter of fact, Boyle did not primarily pair numbers at all. Pairing
numbers is what mathematicians do in defining functions. What Boyle
actually paired were observations pertaining to the same object; and he
discovered that
(3) $v\gamma = 1/p\gamma$ for any inst. gas sample $\gamma$ at the fixed temperature.
This statement is comparable to
(4) $\cot x = 1/\tan x$ for any number x that is not a multiple of $\pi/2$,
with v and p corresponding to cot and tan; and the sample variable $\gamma$, to
the number variable x.
Unfortunately, the classical literature has done all that was possible to conceal
the existing analogies. Besides, it has simulated a parallelism between v, p and
x by indiscriminately referring to them as "variables" and using the same
(italic) type for all of them (whereas the functions are usually denoted by cot
and tan). In an attempt to mask the confusion between fluents and number
variables, a contradiction in terms comparable to "enslaved freeman" was
coined: "dependent variable". Finally, the true analogues of $v = 1/p$, formulae
such as $\cot = 1/\tan$ (connecting two functions just as Boyle's law connects two
fluents) are anathema, and only the corresponding statements about numbers,
such as (4) are admitted. 28
For an observer A, Boyle's law takes the form
(5) $v_A\beta = 1/p_A\pi$ for any two acts $(\pi, \beta) \in \Gamma$.
Relativizing connections of two fluents to a class $\Gamma$ of pairs of
simultaneous co-objective acts is very natural though not logically cogent. At any
rate, since Galileo and Boyle, such (tacitly understood) relativizations
have become second nature to physicists, who have transplanted them,
as matters of course, even to quantum mechanics, a field where they
are rather problematic. In $v = 1/p$, the pairing is altogether hidden.
On the level of general statements about fluents, however, the need for
explicit relativizations is evident. The question "Is $w = 1/u$?" for any
two fluents is incomplete. Certainly it does not necessarily refer to the
entire class dom u $\times$ dom w; that is to say, it does not necessarily mean
"Is each value of w the reciprocal of each value of u?" In this sense, for an
affirmative answer it would be necessary that both u and w were constant
fluents. The question thus must refer to some subset of dom u $\times$ dom w.
But to which subset? No particular subset of the Cartesian product of any
two (especially disjoint) sets is or can be "natural". The intended subset
must be specified. Such a relativization is necessary in order to make the
question complete.
In the broadest sense, the connection of a class of consistent quantities
w with another c.c.q. u relative to a set $\Pi \subset$ dom u $\times$ dom w by the
function f is described in the following basic definition:
$w = fu$ (rel. $\Pi$) if and only if $(\alpha, \beta) \in \Pi$ implies $w\beta = fu\alpha$.
Here, the consequent might be replaced with: $(u\alpha, w\beta) \in f$. For instance,
(3) results if $\Pi$ is the class I of all pairs $(\gamma, \gamma)$. The connection of functions
by functions in traditional analysis is relative to restrictions of j. If j' is
the restriction of j to numbers that are not multiples of $\pi/2$, then (4)
subsumes under the general scheme:
$\cot = j^{-1} \tan$ (rel. j') since $\cot y = j^{-1} \tan x$ for any $(x, y) \in j'$.
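The basic definition translates almost literally into a test over a finite pairing. The sketch below is a hypothetical illustration: the sample labels and values are invented, fluents are dicts, and connected is not a name from the paper.

```python
def connected(w, f, u, Pi):
    """True if w = f u (rel. Pi): (a, b) in Pi implies w b == f(u a)."""
    return all(w[b] == f(u[a]) for a, b in Pi)

# Invented pressures and volumes of three gas samples (proper units).
p = {"g1": 1.0, "g2": 2.0, "g3": 4.0}
v = {"g1": 1.0, "g2": 0.5, "g3": 0.25}

def reciprocal(x):
    return 1.0 / x

identity_pairs = {(g, g) for g in p}              # the class I of all pairs (g, g)
all_pairs = {(a, b) for a in p for b in v}        # the whole Cartesian product

boyle_rel_I = connected(v, reciprocal, p, identity_pairs)
boyle_rel_all = connected(v, reciprocal, p, all_pairs)
```

Relative to the identity pairing the connection holds; relative to the whole Cartesian product it fails, since neither fluent is constant.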
28 It is not unusual to write, e.g.: if $f = 1/g$, then $g = 1/f$ for any two functions f
and g (thus dispensing with number variables). But, in violation of automatic
substitutive procedures, the function variables f and g are replaced, e.g., by tan x
and cot x, and not in the traditional literature by tan and cot.
Clearly, $w = fu$ (rel. $\Pi$) implies $u = \operatorname{Inv} f\, w$ (rel. conv. $\Pi$); and
$w = fv$ (rel. $\Pi$) and $v = gu$ (rel. P) imply $w = fgu$ (rel. $\Pi$P).
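Both implications can be checked on a toy example. The sketch below assumes that the relative product $\Pi$P consists of the pairs (a, c) for which some b has (a, b) in P and (b, c) in $\Pi$; this reading, the data, and all helper names are assumptions made for illustration, and the function g is chosen invertible so that its ordinary inverse can play the role of Inv.

```python
def connected(w, f, u, Pi):
    """True if w = f u (rel. Pi): (a, b) in Pi implies w b == f(u a)."""
    return all(w[b] == f(u[a]) for a, b in Pi)

def converse(Pi):
    return {(b, a) for a, b in Pi}

def relative_product(Pi, Rho):
    """Pairs (a, c) such that (a, b) in Rho and (b, c) in Pi for some b."""
    return {(a, c) for a, b in Rho for b2, c in Pi if b == b2}

u = {"a1": 1.0, "a2": 2.0}
v = {"b1": 2.0, "b2": 4.0}
w = {"c1": 5.0, "c2": 9.0}

def g(x): return 2.0 * x           # v = g u (rel. Rho)
def f(x): return 2.0 * x + 1.0     # w = f v (rel. Pi)
def fg(x): return f(g(x))
def inv_g(x): return x / 2.0

Rho = {("a1", "b1"), ("a2", "b2")}   # subset of dom u x dom v
Pi = {("b1", "c1"), ("b2", "c2")}    # subset of dom v x dom w

chain_ok = connected(w, fg, u, relative_product(Pi, Rho))
converse_ok = connected(u, inv_g, v, converse(Rho))
```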
It is now clear why functions have been defined as on p. 454, and "multi-valued"
functions have been strictly excluded. If the latter were admitted,
then, relative to every pairing, every fluent would be a function of every other
fluent. The question "Is w a function of u rel. $\Pi$?", which is so important in
science (e.g., in thermodynamics), would be deprived of any meaning. However,
for any 2-place function F, one may define:
(6) $F(u, w) = 0$ (rel. $\Pi$) if and only if $(\alpha, \beta) \in \Pi$ implies $F(u\alpha, w\beta) = 0$.
Of course only if $F(u\alpha, w\beta) \neq 0$ for some $(\alpha, \beta) \in$ dom u $\times$ dom w (especially, if
$F \neq 0$) does (6) establish a connection between u and w.
The most general connection of a functor w with n functors $v_1, \ldots, v_n$
relative to P $\subset$ dom $v_1 \times \ldots \times$ dom $v_n \times$ dom w by the n-place
function G is given by:
$w = G[v_1, \ldots, v_n]$ (rel. P) if and only if
$(\beta_1, \ldots, \beta_n, \gamma) \in$ P implies $w\gamma = G(v_1\beta_1, \ldots, v_n\beta_n)$.
The chain rule reads as follows:
$w = G[v_1, \ldots, v_n]$ (rel. P) and $v_i = F_i[u_{i,1}, \ldots, u_{i,m_i}]$ (rel. $\Pi_i$) imply
$w = G[F_1, \ldots, F_n][u_{1,1}, \ldots, u_{n,m_n}]$ (rel. P$[\Pi_1, \ldots, \Pi_n]$).
The rate of change of w with, say, $v_n$ rel. P (keeping $v_1, \ldots, v_{n-1}$
unchanged) is a fluent with the domain P, which must not be confused with
the n-th place partial derivative $D_nG$, which is an n-place function with a
domain $\subset$ dom G. While the two symbols are frequently misrepresented
as synonyms, the concepts are connected 29 by the formula:
$\left(\frac{dw}{dv_n}\right)_{v_1, \ldots, v_{n-1}} = D_nG(v_1, \ldots, v_n)$ (rel. P).
But it is important to note that the rate of change of w with $v_n$ rel. P may
well exist without w being a function of $v_1, \ldots, v_n$ rel. P. An analogous
distinction is necessary between the cumulation of w with $v_n$ and the n-th
place partial integral of G.
From the preceding exposition of the material, based on explicit
definitions, there emerge the outlines of its axiomatic treatment. A group
I of postulates has to be devoted to partial order in a realm of undefined
entities (called n-ary relations) in which there are two operations,
intersection and Cartesian multiplication, subject to postulates of group II.
In terms of these operations, associative substitutions are introduced
(group III). Union of relations plays a small role, if any, and certainly
none in that important subclass of relations whose elements are called
classes of consistent pairs (group IV), because in the realm of c.'s c.p.
union cannot in general be defined. Of particular significance among
c.'s c.p. are selector and identity relations (group V) which, as has been
illustrated in the realm of the 1-place functions, play the roles of domains
of c.'s c.p. At this point, the class of all real numbers (or, if one pleases,
a field or ring) enters the picture. By means of it, consistent classes of
quantities or functors can be singled out (group VI) and, among them,
functions constituting a hypergroupoid. Selector relations that are
functions are the all-important selector functions, including the identity
function j. What precedes is a basis for treating the connection of one
functor with n other functors by means of an n-place function relative to
an (n + 1)-ary relation between their domains, as well as a functional
interrelation of m functors relative to an m-ary relation.
29 Cf. Calculus, Chapter XI, especially pp. 306-315 and 332-341, and Menger [16].
Clearly, such an axiomatic theory represents the most general
treatment of models in the sense in which this term is used in science, especially,
in social sciences. An analogy appears with postulational geometry, which
deals with undefined elements, called points and lines for the sake of a
suggestive terminology, while all that is assumed about them is that they
satisfy certain assumptions. Subsequently, they are compared with
observable objects, e.g., in the astronomical space, with cross hairs and light
rays. Models are formulated in terms of functor variables: undefined
classes of consistent quantities, called, say, time and position or pressure
and volume and denoted by t and s or p and v, for the sake of a suggestive
terminology, while all that is assumed about them is that, relative to
undefined pairings of their domains, those functors are interrelated by
certain functions. Subsequently, an observer A compares them with
observed fluents ($t_A$ and $s_A$ or $p_A$ and $v_A$) relative to specified pairings of
the domains of the latter. He trusts that, within certain limits of accuracy,
the statements concerning the undefined functors in the model will be
verified by known connections between the observed fluents and, some of
them, he hopes, by previously unknown connections 30.
30 The ideas here outlined seem to supplement the existing theory on concept
formation in empirical science; cf. Carnap [2] and Hempel [5].
As far as the general theory of fluents is concerned, the prediction may
be ventured that indiscriminate uses of the term "variable" and of
nondescript letters x will give way to more careful distinctions; and that
references to domains of fluents as well as to pairings of those domains,
once introduced, will be permanently incorporated in the articulate
formulations of scientific laws.
Acknowledgements
The author is grateful to Professors M. A. McKiernan, B. Schweizer, and A.
Sklar for valuable suggestions in connection with this paper, and to the Carnegie
Corporation of New York for making it possible to devote time to the development
of the material.
Bibliography
[1] ARTIN, E., Calculus and Analytic Geometry. Charlottesville 1957, 126 pp.
[2] CARNAP, R., The methodological character of theoretical concepts. In Minnesota
Studies in Philosophy of Science, vol. 1, Minneapolis 1956.
[3] COURANT, R., Differential and Integral Calculus, vol. 1,
[4] HELLER, I., On generalized polynomials. Reports of a Mathematical Colloquium,
2nd series, issue 8 (1947), pp. 58-60.
[5] HEMPEL, C. G., Fundamentals of concept formation in the empirical sciences.
International Encyclopedia of Unified Science, vol. 2 no. 7 Chicago 1952.
[6] MCKIERNAN, M. A., Les séries d'itérateurs et leurs applications aux équations
fonctionelles. Comptes Rendus Paris, vol. 246 (1958), pp. 2331-2334.
[7] MCKIERNAN, M. A., Le prolongement analytique des séries d'itérateurs. Comptes Rendus
Paris, vol. 246 (1958), pp. 2564-2567.
[8] MCKINSEY, J. C. C., Postulates for the calculus of binary relations. Journal of
Symbolic Logic, vol. 5 (1940), pp. 85-97.
[9] MCKINSEY, J. C. C., SUGAR, A. C. and P. SUPPES, Axiomatic foundations of classical particle
mechanics. Journal of Rational Mechanics and Analysis, vol. 2 (1953), pp.
253-272.
[10] MENGER, K., Calculus. A Modern Approach. Boston 1955, XVIII + 354 pp.
[11] MENGER, K., The ideas of variable and function. Proceedings of the National Academy,
U.S.A., vol. 39 (1953) pp. 956-961.
[12] MENGER, K., New approach to teaching intermediate mathematics. Science, vol. 127
(1958) pp. 1320-1323.
[13] MENGER, K., Algebra of Analysis. Notre Dame Mathematical Lectures, vol. 3, 1944.
50 pp.
[14] MENGER, K., Tri-operational algebra. Reports of a Mathematical Colloquium, 2nd
series, issue 5-6 (1945) pp. 3-10 and issue 7 (1946) pp. 46-60.
[15] MENGER, K., Calculus. A Modern Approach. Mimeographed Edition, Chicago 1952,
XXV + 255 pp.
[16] MENGER, K., Rates of change and derivatives. Fundamenta Mathematicae, vol.
46 (1958), pp. 89-102.
[17] MENGER, K., Random variables and the general theory of variables. Proceedings of the
3rd Berkeley Symposium on Mathematical Statistics and Probability, vol. 2,
Berkeley 1956, pp. 215-229.
[18] MILGRAM, A. N., Saturated polynomials. Reports of a Mathematical Collo-
quium, 2nd series, issue 7 (1946), pp. 65-67.
[19] NÖBAUER, W., Über die Operation des Einsetzens in Polynomringen.
Mathematische Annalen, vol. 134 (1958) pp. 248-259.
[20] RUSSELL, B., The Principles of Mathematics. Vol. 1. Cambridge 1903, XXIX
+ 534 pp.
[21] TARSKI, A., Introduction to Logic. New York 1941, XVIII + 239 pp.
[22] TARSKI, A., On the calculus of relations. Journal of Symbolic Logic, vol. 6 (1941) pp.
73-89.
Symposium on the Axiomatic Method
AXIOMATICS AND THE DEVELOPMENT OF CREATIVE TALENT
R. L. WILDER
University of Michigan, Ann Arbor, Michigan, U.S.A.
Introduction. Perhaps I should apologize for presenting here a paper
that embodies no new results of research in axiomatics. However, for some
time I have felt that someone should record a description of an important
method of teaching based on the axiomatic method, and this conference
seems an appropriate place for it.
Actually, I can point to an excellent precedent in that the late E. H.
Moore devoted most of his retiring address [2], as president of the Amer-
ican Mathematical Society, to a study of the role of the then rapidly
developing abstract character of pure mathematics, especially the in-
creasing use of axiomatics, in the teaching of mathematics in the primary
and secondary schools. Just how much influence E. H. Moore's ideas had
on the later developments in elementary mathematical education in this
country, I do not know. It is perhaps significant that the increasing
concern with these matters on the part of a large section of the member-
ship of the American Mathematical Society (particularly in the Chicago
Section) led, several years later, to the forming of a new organization,
the Mathematical Association of America, whose special concern was with
the teaching of mathematics in the undergraduate colleges. 1
Historical Development of the Method. We have heard a great deal, the
past fifty years or so, of the use of the axiomatic method as a tool for
research. Indeed, this use of the method has been justly considered as one
of the most outstanding and surprising phenomena in the evolution of
modern mathematics. Scarcely a half century ago, so great a mathematician
as Poincaré could devote, in an article entitled The Future
of Mathematics [6], less than half a page to the axiomatic method. And
although conceding the brilliance of Hilbert's use of the method, he
predicted that the problem of providing axiomatic foundations for
various fields of mathematics would be very "restricted", and that "there
would be nothing more to do when the inventory should be ended, which
1 See [1], parts VII and XV 6, but especially p. 81 and p. 146.
could not take long. But when", he continued, "we shall have enumerated
all, there will be many ways of classifying all; a good librarian always
finds something to do, and each new classification will be instructive for
the philosopher."
As recently as 1931, Hermann Weyl matched the contempt veiled in
these remarks by a fear expressed as follows: "—I should not pass over in
silence the fact that today the feeling among mathematicians is beginning
to spread that the fertility of these abstracting methods [as embodied in
axiomatics] is approaching exhaustion. The case is this: that all these
nice general notions do not fall into our laps by themselves. But definite
concrete problems were conquered in their undivided complexity, single-
handed by brute force, so to speak. Only afterwards the axiomaticians
came along and stated: Instead of breaking in the door with all your
might and bruising your hands, you should have constructed such and
such a key of skill, and by it you would have been able to open the door
quite smoothly. But they can construct the key only because they are
able, after the breaking in was successful, to study the lock from within
and without. Before you can generalize, formalize and axiomatize, there
must be a mathematical substance. I think that the mathematical
substance in the formalizing of which we have trained ourselves during
the last decades, becomes gradually exhausted. And so I foresee that the
generation now rising will have a hard time in mathematics." 2
Evidently mathematical genius does not correlate well with the gift of
prophecy, since neither Poincaré's disdain nor Weyl's fears have been
justified. Neither of these eminent gentlemen seems to have realized that
a powerful creative tool was being developed in the new uses of the
axiomatic method. It was Weyl's good fortune to live to see and acknowledge
the triumphs of the method. And undoubtedly had Poincaré
lived to observe how the method contributed to the progress of mathe-
matics, he would gladly have admitted his prophetical shortcomings. It is
easy to comprehend why they felt as they did, and as, conceivably, a
majority of their colleagues felt. For until quite recent years, the method
had achieved its most notable successes in geometry, where axiom
systems often served as suitable embalming devices in which to wrap up
theories already worked out and in a stage of decline. The value of the
method as a tool for opening up vast new domains for mathematical
2 Quoted from H. Weyl [7]. It is to Weyl's credit that he acknowledges, in this
connection, the brilliant results obtained by Emmy Noether by her pioneering use
of the axiomatic method in algebra.
investigation, as it has done in algebra and topology for example, was not
yet sufficiently exemplified to make an impression on the mathematical
public. Peano's fundamental researches in logic and number theory were
concealed in his unique "pasigraphy"; and besides, was not this again a
case of wrapping old facts in new dress (mused the uncomprehending
analyst)? Similarly Grassmann's earlier work in his (now justly appreciated)
Ausdehnungslehre was concealed in a mass of philosophical
obscurities, and moreover the philosophy of the time was dominated by a
Kantian intuitionism not receptive to the idea of mathematics as a
science of formal structures.
Nevertheless, the evolution of modern mathematics was proceeding in
a direction which made inevitable those uses of axiomatics with which
every modern mathematician is now familiar. No one, among the mathematicians
active around the turn of the century, appears to have been
more aware of this trend than the American mathematician E. H. Moore.
Moore's interest in, and use of, axiomatic procedures is well known, and I
have already remarked on his interest in the influence which they might
have on the teaching of elementary mathematics. Of importance for my
purposes is the influence of his ideas on a group of young mathematicians
who were under his tutelage at the time, particularly R. L. Moore and
O. Veblen. Both Veblen and R. L. Moore wrote their doctoral dissertations
in the axiomatic foundations of geometry. And the interests of both soon
turned to what was at the time a new branch of geometry in which metric
ideas play no official role, viz. topology, or as it was then called, analysis
situs.
It is an interesting fact, however, that the topological interests of the
two diverged, the one, Veblen, following the line initiated by Poincaré
and subsequently called "combinatorial topology", the other, R. L.
Moore, following the line stemming from the work of Cantor and Schoen-
flies and subsequently called "set-theoretic topology". And whereas the
latter, set-theoretic topology, lent itself naturally to the axiomatic
approach which Moore continued to develop, the former, combinatorial
topology, was not left by Poincaré (whose feelings toward the axiomatic
method we have already indicated above) in a form suitable to axiomatic
development.
The first major work of R. L. Moore in "analysis situs" [3] was published
in 1916. 3 It embodied a set of axioms characterizing the analysis
3 There are three axiom systems given in this paper. In our remarks we refer
only to that one which is designated in [3] by the symbol "Σ_1".
situs of the euclidean plane. In a later paper [4], Moore showed this
axiom system to be categorical, and still later [5] applied it in a way
prophetic of the new, creative uses of the axiomatic method soon to
come into vogue.
However, of much more importance for my present purposes, was the
manner in which Moore 4 used his axiom system for plane analysis situs
for discovering and developing creative talent. Those of us who are
accustomed to the use of axioms in constructing new theories, or for other
technical creative purposes, may have lost sight of the fact that the
axiomatic method can serve as the basis for a most useful teaching
device.
I am not referring to the traditional use of axioms in teaching high
school geometry of the euclidean type. Although here, in the hands of an
inspired teacher, the method can and sometimes undoubtedly does turn
up potential mathematicians, most of the teaching of high school geometry
seems to be of two kinds. Either it is based on the use of a standard text
book in which the theorems are all worked out in detail for study by the
pupil, with a supply of minor problems — so-called "originals" — to be
done by the pupil and geared usually to the ability of the "average"
student; or it is carried out in connection with a laboratory process which
is supposed to exemplify the so-called "reality" of the theorems proved,
thereby preventing the abstract character of the system from becoming
too dominant. In short, the whole process may be considered overly
adapted to the capacities of the "average" student and consequently
generally loses — perhaps justifiably — its potentiality for developing
the mathematical talents of the more gifted student.
Nor am I referring to the fact that quite commonly, in our graduate
courses in algebra and topology, we use the axiomatic method for setting
up abstract systems. I mean something more than this. What I mean can
perhaps be indicated by a remark which one of my former students made
to me in a recent letter: "I am having quite good success teaching a
course, called Foundations of Analysis, by the Moore-Socrates method."
The use by Moore of the axioms for plane analysis situs in his teaching had
many elements in common with the Socratic method as revealed in the
"Dialogues", especially in the general type of interplay between master
and pupil.
Moore proceeded thusly: He set up a course which he called "Foun-
dations of Mathematics", and admitted to attendance in the course only
4 From here on, by "Moore" I shall mean R. L. Moore.
such students as he considered mature enough and sufficiently sympathetic
with the aims of the course to profit thereby. It was not, then, a required
course, nor was it open to any and all students who wanted to "learn
something about" Moore's work. He based his selection of students,
from those applying for admission, on either previous contacts (usually
in prior courses) or (in the case of students newly arrived on the campus)
on analysis via personal interview — usually the former (that is, previous
contacts). The amazing success of the course was no doubt in some
measure due to this selection process.
He started the course with an informal lecture in which he supplied
some explanation of the role to be played by the undefined terms and
axioms. But he gave very little intuitive material — in fact only meagre
indication of what "point" and "region" (the undefined terms) might
refer to in the possible interpretations of the axioms. He might take a
piece of paper, tear off a small section, and remark "Maybe that's a
region". However, as the course progressed, more intuitive material was
introduced, oftentimes by means of figures or designs set up by the
students themselves.
The axioms were eight 5 in number, but of these he gave only two or
three to start with; enough to prove the first few theorems. The re-
maining axioms would be introduced as their need became evident. He
also stated, without proof, the first few theorems, and asked the class to
prepare proofs of them for the next session.
In the second meeting of the class the fun usually began. A proof of
Theorem 1 would be called for by asking for volunteers. If a valid proof
was given, another proof different from the first might be offered. In any
case, the chances were favorable that in the course of demonstrating one
of the theorems that had been assigned, someone would use faulty logic
or appeal to a hastily built-up intuition that was not substantiated by the
axioms.
I shall not bore you with all the details; you can use your imaginations,
if you will, regarding the subsequent course of events. Suffice it to say
that the course continued to run in this way, with Moore supplying
theorems (and further axioms as needed) and the class supplying proofs.
I could give you many interesting — and amusing — accounts of the
byplay between teacher and students, as well as between the students
themselves; good-natured "heckling" was encouraged. However, the
point to be emphasized is that Moore put the students entirely on their own
5 One of these was later shown (by the present author [8]) not to be independent.
resources so far as supplying proofs was concerned. Moreover, there was no
attempt to cater to the capacities of the "average" student; rather was
the pace set by the most talented in the class.
Now I grant that there seems to be nothing sensational about this.
Surely others have independently initiated some such scheme of teaching.6
The noteworthy fact about Moore's work is that he began finding the
capacity for mathematical creativeness where no one suspected it ex-
isted! In short, he found and developed creative talent. I think there is no
question but that this was in large measure due to the fact that the
student felt that he was being "let in" on the management and handling of
the material. He was afforded a chance to experience the thrill of creating
mathematical concepts and to glimpse the inherent beauty of mathematics,
without having any of the rigor omitted in order to ease the process. And
in their turn, when they went forth to become teachers, these students
later used a similar scheme. True, they met with varying success — after
all, a pedagogical system, no matter how well conceived, must be operated
by a good teacher. Their success was striking enough, however, that one
began to hear comments and queries about the "Moore method". And it
is partly in response to these that I am talking about the subject today. It
seems that it is time someone described the method as it really operated,
and perhaps thereby cleared up some of the folklore concerning it.
Description of the Method. In the interest of clarity, I shall arrange my
remarks with reference to certain items which I think, after analyzing
the method, are in considerable measure basic to its success. These items
are as follows:
1. Selection of students capable (as much as one can tell from personal
contacts or history) of coping with the type of material to be studied.
2. Control of the size of the group participating; from four to eight students
probably the best number.
3. Injection of the proper amount of intuitive material, as an aid in the
construction of proofs.
4. Insistence on rigorous proof, by the students themselves, in accordance
with the ideal type of axiomatic development.
5. Encouragement of a good-natured competition; it can happen that as
many different proofs of a theorem will be given as there are students in
the class.
6 Professor A. Tarski informed me after the reading of this paper that he had used
a somewhat analogous method in one of his courses at the University of Warsaw.
6. Emphasis on method, not on subject matter. The amount of subject
matter covered varies with the size of class and the quality of the indi-
vidual students.
I think these six items lie at the heart of the method. Of course they
slight the details; e.g., the manner in which Moore exploited the compe-
tition between students, and the way in which he would encourage a
student who seemed to have the germ of an idea, or put to silence one who
loudly proclaimed the possession of an idea which upon examination
proved vacuous. I imagine that it was in such things as these that Moore
most resembled Socrates. But these are matters closely related to Moore's
personality and capability as a teacher, so I shall confine myself to the
six points enumerated above so far as the description of the method is
concerned. They are, I realize, themselves pedagogic in nature, but more
of the nature of what I might call axiomatic pedagogy. They constitute, I
believe, a guide to the successful use of axiomatics in the development of
creative talent.
I would like to comment further on them:
1. Selection of students capable of coping with the type of material to be
studied. I have already made some remarks in this regard. I pointed out
that Moore based his judgment regarding maturity either on his ex-
perience with the student in prior courses, or on personal interviews. I
might add, parenthetically, that as the years went by and his students
began to use his methods in their own teaching, a sort of code developed
between them whereby one of the "cognoscenti" would apprise one of his
colleagues in another university of the availability of potentially creative
material. For example, the "pons asinorum" of Moore's original axiom
system was "Theorem 15". If one of Moore's graduates wished to place a
student for further work under the tutelage of another of Moore's students
at a different institution, and could include in his recommendation the
statement, "He proved Theorem 15", then this became a virtual "open
sesame".
But Moore, himself, was not dependent on other institutions; he found
his students, generally speaking, in the student body of the University
of Texas. He had a singular ability for detecting talent among under-
graduates, and often set his sights on a man long before he was ready for
graduate work. Indeed, in some instances, he would allow in his class in
"Foundations" an undergraduate whom he deemed ready for creative
work. For Moore believed that a man should start his creative work as
soon as possible, and the younger the better. He reasoned that one could
always pick up "breadth" as he progressed. It was not unusual for him to
discover talent in his calculus classes. And once he suspected a man of
having a potentially mathematical mind, he marked that man for the rest
of the course as one with whom he would cross his foils, so to speak. By the
end of the term, he was usually pretty sure of his opinion of the man.
Of course he could not, in the very nature of the case, always be certain.
This applies especially to those who entered his class as graduate students
from other institutions, who had had no previous work with him, and
whom he had to screen usually in a single interview at registration time.
And when a student of little or no talent did slip by, he was doomed to a
semester of either sitting and listening (usually with little comprehension),
or to feverishly taking notes which he hoped to be able to understand by
reading outside of class. In the latter case he was often disappointed, for
as we all know, one's first proof of a theorem is usually not elegant, to
understate the case, and the first proofs of a theorem given in class were
likely to be of this kind. But as I stated before, the aim of the course was
not so much to give certain material — the student who wished the latter
would have been better advised to read a book or to seek out the original
material in journals. I would call these "note-takers" the "casualties" of
the course. So you see it was humane, as well as good strategy, to allow
only the "fit" to enroll in the course.
I might remark, too, that those of us who went from Texas to other
institutions as young instructors, did not usually find it possible to
institute Moore's "exclusion policy" in all its rigor. For various reasons,
we often had to throw our courses open to one and all. This naturally
led to certain modifications, as, for instance, making sure that the "note-
takers" ultimately secured an elegant proof; this seemed the least that
they were entitled to under a system where they were not sufficiently
forewarned of what to expect, and of especial importance if the material
covered was to be used by the student as basic information in later
courses.
2. Control of the size of the group participating; from four to eight
probably the best number. This is obviously not independent of 1., since
Moore's method of selecting students was clearly suited to keeping down
the size of the class. Some of us, however, especially during periods of high
enrollments, have had to cope with classes of as many as 30 students or
more. I can report from experience that even with a class this large, the
method can be used. Or course inevitably a few (sometimes only two or
three) students "star in the production*'. I have found, however, that
these "star" students often profited from having such a large audience
as was afforded by the "non-active" portion of the class. Often the "non-
stars" came up with some good questions and sometimes — rarely to be
sure — with a suggestion that led to startling consequences.
In short, although from four to eight is the ideal size of class for the use
of the axiomatic method, it is not impossible to handle classes of as many
as 30 while using the method.
3. Injection of the proper amount of intuitive material, as an aid in the
construction of proofs. This, I hardly need emphasize, must be handled
carefully. With no intuitive background at all, the student has little
upon which to fix his imagination. The undefined terms and the axioms
become truly meaningless, and a mental block perhaps ensues. Here the
instructor must exercise real ingenuity, striving to furnish that amount of
intuitive sense that will be sufficient to suggest processes of proof, while
at the same time holding the student to the axiomatic basis as a founda-
tion for all assertions of the proof.
I have always been interested, in my use of the axiomatic method in
Topology, in observing the degree to which the various students used
figures in giving a demonstration. Some relied heavily on figures; others
used none at all, being content to set down the successive formulae of the
proof. I have noticed that the former type of student usually developed
an interest in the geometric aspects of the subject, following the tradition
of classical topology, while the latter developed greater interest in the new
algebraic aspects of the subject. There may be considerable truth in the
old folklore that some are naturally geometric-minded, while others have
not so much geometric sense but show great facility for algebraic types of
thinking. I don't know of any better way to discover a student's pro-
pensities in these regards, than to give him a course in modern topology on
axiomatic lines.
4. Insistence on rigorous proof, by the students themselves, in accordance
with the ideal type of axiomatic development. I want to emphasize here two
advantages that the axiomatic development offers.
In the first place, I have seen the method rescue potentially creative
mathematicians from oblivion. Without knowing the reason therefor,
they had become discouraged and depressed, having taken course after
course without "catching on" — with no spark of enlightenment. The
reason for this was evidently that their innate desire for clearcut under-
standing and rigor was continually starved in course after course. One can
appreciate the gleam in a student's eye when, provided with the type of
rigor which the axiomatic method affords, he finds his mathematical self
at last; for the first time, seemingly, he can let his creative powers soar
with a feeling of security. This is truly one of the ways in which creative
talent is discovered.
In the second place, even the average student feels happy about knowing
just what he is allowed to assume, and in the feeling that at last what he is
doing has, in his eyes, an almost perfect degree of validity. I can illustrate
by an example here. I was once giving a course in the structure of the real
number system, using a system of axioms and the "Moore method". In
the class was a man who had virtually completed his graduate work and
was writing his dissertation in the field of analytic functions. At the end of
the course he came to me and said, "You know, I feel now for the first
time in my life, that I really understand the theory of real functions". I
knew what had happened to him. Despite all his courses and reading in
function theory, he had never felt quite at ease in the domain of real
numbers. Now he felt that, having been thrown wholly on his own
resources, he had come to grips with the most fundamental properties of
the real number system, and could, so to speak, "look a set of real numbers
in the eye!"
5. Encouragement of a good-natured competition. I have found that an
interesting by-play often developed between students, either to see who
could first obtain the proof of a theorem, or failing that, who could give
the most elegant proof. I presume this is a foretaste of the situation in
which the seasoned mathematician often finds himself. I hardly need to
cite historic instances to an audience like this; instances in which a
settlement of a long outstanding problem was clearly in the offing, and
the experts were vying with one another to see who would be the first to
achieve the solution. This always adds zest to the game of mathematics,
either on the elementary level or on the professional level. And no system
of teaching lends itself better to this sort of thing than the one I am
discussing.
There is also the possibility that an original-minded student will
discover a new and more elegant proof of a classical theorem. I have had
this happen several times, and on at least one occasion, to which I shall
refer again below, the proof given failed to use one of the conditions stated
in the classical hypothesis, so that a new and stronger theorem resulted.
6. Emphasis on method, not on subject matter. When one lectures, or
uses a text, the student is frequently presented with a theorem and then
given its proof before he has had time to digest the full meaning of the
theorem. And by the time he has struggled through the proof presented,
he has been utterly prejudiced in favor of the methods used. They are all
that will occur to him, as a rule. Use of the axiomatic method with the
student providing his own proof, forces an acquaintance with the meaning
of the theorem, and a decision on a method of proof. I have continually
in my classes, whenever existence proofs were demanded, urged the
students to find constructive methods whenever possible. In this way, I
have had presented to me constructive proofs in instances where I did not
theretofore know that such proofs could be given.
In short, use of the axiomatic method not only encourages the student
to develop his own creative powers, but sometimes leads to the invention
of new methods not previously conceived.
There is one other feature of the method as Moore used it that I have
omitted above, for various reasons, chiefly because of the vagueness of its
terms and the debatability of any interpretation of it:
7. Selection of material best suited to the method. It is probably wisest to
select certain special subjects which seem best suited for the avowed
purposes of discovering and developing creative ability. For example, one
might select material that presupposes little in the way of special tech-
niques (as, for instance, the techniques of classical analysis), but that does
require that ability to think abstractly which should be a characteristic
of the mature student of mathematics, and which requires little intuitive
background. The material which Moore chose was of this nature; another
such selection might be the theory of the linear continuum.
In the case of the material which Moore selected, the student was led
quickly to the frontiers of knowledge; that is, to the point where he might
soon be doing original research. I think this aspect of his method is not,
however, essential to its success in developing creative talent. As Moore
used the method, the line between what was known and what was un-
known was not revealed to the students. Customarily they were not
apprised of the source of the axioms or the theorems; for all they knew,
these had probably never been published. And he could go on with them
to unsolved problems through the device of continuing to state theorems
whose validity he might not himself have settled, without their ever being
aware of the fact.
Consequently, so far as this item 7 is concerned, I would say that the
important aspect of it is the selection of material requiring little intuitive
background and presupposing mathematical maturity but little technique.
The techniques of deduction, proof, and of discovering new theorems
are naturally part of the design of the course; the axiomatic method
is ideal for the development of these, and they should be given priority
over the quantity of material covered.
The justification for the system is of course its success. It soon reveals
to both teacher and student whether or not the latter possesses mathe-
matical talent. It quickly selects those who have the "gift", so to speak,
and develops their creative powers in a way that no other method ever
succeeded in doing. Every mathematician, now and in the past, has
recognized the necessity for doing mathematics, not just reading it, and
has assigned "originals" for the student to do on his own. In the Moore
system, we find the "original" par excellence — there is nothing in the
course but originals! I should repeat, in connection with these remarks,
that it is not unusual for a student to find a new proof of a known theorem
that deserves publication, as well as for new theorems to be found. I had
one outstanding case of this in my own use of the method, where the new
proof showed one could dispense with part of the traditional hypothesis;
and I had the student go on to incorporate his methods into proving
another and similar theorem which was historically related to the former
and was susceptible to the same improvement in its hypothesis.
The fact that what the logician would call the "naive" axiomatic
method is used, does not seem to cause any objection from the student.
In fact, I am afraid that a strict formalism might not work so well;
although this is debatable, and certainly a carefully formulated proof
theory would be quite adaptable to certain types of material. The use of a
"natural" language throughout, except for the technical undefined terms,
was, however, an important feature of the method as Moore used it, not
only aiding the intuition but enabling that competition mentioned in
item 5 to "wax hot" at crucial moments.
This brings me to some remarks about an area of teaching in which
tradition is most strong, namely the undergraduate curriculum. Today we
hear a great deal about encouraging the young student to go into a
mathematical or scientific career. Unfortunately much potential creative
talent is lost to mathematics early in the undergraduate training, and
much of this, I am sure, is due to traditional modes of presentation. It is
possible that the axiomatic approach offers at least a partial solution of
this problem.
The axiomatic method in the undergraduate course. As Moore
used the axiomatic method for teaching on the graduate level, the aim was
to discover and develop creative ability. Is there not a possibility that the
method could be employed to advantage at a lower level, so that the
potentially creative mathematician will be encouraged to continue in
mathematics to the point where his talents can be more decisively put to
the test?
I am convinced that one of our greatest errors in the United States
educational system has been to underestimate the ability of the young
student to think abstractly. Moreover, I am convinced that as a result,
we actually force him to think "realistically" where actually he would
prefer to think abstractly, so that by the time he begins graduate work,
his ability to abstract has been so dulled that we have to try to develop
it anew.
It seems probable that we could try using the axiomatic method on a
lower level, perhaps even on the freshman level, at selected points where
the material is of a suitable kind. In the interests of caution, perhaps we
should experiment on picked groups first, as well as with carefully
selected material. It is possible that we might light creative sparks where,
with the conventional type of teaching, no light would ever dawn. Some
years ago I had a chance to do this sort of thing, with a picked group of
around ten students. I did not have an opportunity to teach most of these
students again until they became graduates. But I am happy to state
that a majority of them went on to the doctorate (not necessarily in
mathematics, for some turned to physics), but at least they went on into
creative work. I don't wish to give myself credit here; it is the method that
deserves the credit. These men discovered unsuspected powers in them-
selves, and could not resist cultivating and exercising them further.
Moreover, I found they were delighted at being able to establish their
ideas on a rigorous basis. For example, in starting the calculus, I gave
them precise definitions, etc., for a foundation of the theory of limits in
the real number system, and let them establish rigorously on this foun-
dation all the properties of limits needed in the calculus. The result was
that they covered the calculus in about half the time ordinarily required.
Admittedly some of this saving in time was due to the select nature of the
class, but a major part, I am convinced, was due to the confidence and
interest induced by establishing the theory of limits on a firm basis.
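As an illustration of the kind of property the students could be asked to
establish, consider the sum rule for limits. The sketch below assumes the
usual epsilon-delta definition; the exact definitions given to the class are
not recorded in this account, so the formulation is an assumption.

```latex
\textbf{Definition (assumed).} $\lim_{x \to a} f(x) = L$ means: for every
$\varepsilon > 0$ there is a $\delta > 0$ such that
$0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$.

\textbf{Sum rule.} If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$,
then $\lim_{x \to a} \bigl(f(x) + g(x)\bigr) = L + M$.

\textbf{Proof.} Let $\varepsilon > 0$. Choose $\delta_1 > 0$ and $\delta_2 > 0$
such that
\[
0 < |x - a| < \delta_1 \implies |f(x) - L| < \tfrac{\varepsilon}{2},
\qquad
0 < |x - a| < \delta_2 \implies |g(x) - M| < \tfrac{\varepsilon}{2}.
\]
Put $\delta = \min(\delta_1, \delta_2)$. If $0 < |x - a| < \delta$, then by the
triangle inequality
\[
\bigl| f(x) + g(x) - (L + M) \bigr|
\le |f(x) - L| + |g(x) - M|
< \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon . \qquad \blacksquare
\]
```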
In the presidential address of E. H. Moore to which I referred in my
introduction, he stressed the advisability of mixing the real and the
abstract in the teaching of mathematics in the secondary schools. But
(and here I quote from E. H. Moore's address, p. 416) " — when it comes
to the beginning of the more formal deductive geometry why should not
the students be directed each for himself to set forth a body of geometric
fundamental principles, on which he would proceed to erect his geomet-
rical edifice? This method would be thoroughly practical and at the
same time thoroughly scientific. The various students would have differ-
ent systems of axioms, and the discussion thus arising naturally would
make clearer in the minds of all precisely what are the functions of the
axioms in the theory of geometry." Here was evidently a suggestion for
the creative use of axiomatics at the high school level.
There are currently experiments being conducted in some under-
graduate colleges which are based on modifications of the methods Moore
used. For example, I know of one case 7 where a special course of this
kind, for freshmen, has been devised. One-half the course is spent
establishing arithmetic on an axiomatic basis. The numbers 0, 1, 2, etc. are
used, but the development is rigorous, and indeed approaches the rigor
of a formal system in that the rules for proof are explicitly set forth. By
the use of variables, the student is led gradually into algebra, which
occupies most of the latter half of the course. The course terminates in an
analysis, based on truth tables, of the formal logic to which the student
has gradually become accustomed during the course. I judge that one of
the reasons for the success which the course seems to have achieved is
that the student is made aware of the reasons for the various arithmetic
manipulations in which he was disciplined in the elementary schools; as,
for instance, why one inverts and multiplies in order to divide by a
fraction. This course has, incidentally, revealed that students who do not
do well on their placement examinations are not necessarily laggards,
weak-minded, or susceptible of any of the other easy explanations, but
that they often are intelligent, capable persons who have been antagonized
by traditional drill methods. Moreover, some of these students are induced
by the course into going further in mathematics. I believe this course is
still in a developmental stage, and I await with interest reports on its
effectiveness. One gets the feeling from reading the text used that the
student is being treated with trust, as naturally curious to know the why
of what he is doing, and as being intelligent enough to find out if permitted!
7 At the University of Miami.
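The "invert and multiply" rule mentioned above can serve as an example of
such a justified manipulation. The derivation below is only a sketch: it
assumes that division is defined as the inverse of multiplication and that
multiplication of fractions is commutative and associative. The axioms
actually used in the course text are not reproduced here.

```latex
Let $b, c, d \neq 0$. By definition (assumed), $\dfrac{a}{b} \div \dfrac{c}{d}$
is the number $x$ satisfying
\[
\frac{c}{d} \cdot x = \frac{a}{b}.
\]
Take $x = \dfrac{a}{b} \cdot \dfrac{d}{c}$. Then, by commutativity and
associativity of multiplication,
\[
\frac{c}{d} \cdot \Bigl( \frac{a}{b} \cdot \frac{d}{c} \Bigr)
= \frac{a}{b} \cdot \Bigl( \frac{c}{d} \cdot \frac{d}{c} \Bigr)
= \frac{a}{b} \cdot 1
= \frac{a}{b}.
\]
Conversely, if $\dfrac{c}{d} \cdot x = \dfrac{a}{b}$, multiplying both sides by
$\dfrac{d}{c}$ gives $x = \dfrac{a}{b} \cdot \dfrac{d}{c}$, so the solution is
unique. Hence
\[
\frac{a}{b} \div \frac{c}{d} = \frac{a}{b} \cdot \frac{d}{c}.
\]
```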
During the past few years a number of elementary texts have been
published which use the axiomatic method to some extent. Perhaps
this is a sign of a trend. I hope that in my remarks I have not over-
emphasized to such an extent as to give an impression that I think the
axiomatic method is a cure-all. I do not think so. Nor do I think it
desirable that all courses should be axiomatized! But I believe that the
great advances that the method has made in mathematical research
during the past 50 years can, to a considerable extent, find a parallel in
the teaching of mathematics, and that its wise and strategic use, at special
times along the line from elementary teaching to the first contacts with
the frontiers of mathematics, will result in the discovery and development
of much creative talent that is now lost to mathematics.
Bibliography
[1] ARCHIBALD, R. C., A semicentennial history of the American Mathematical
Society 1888-1938. American Mathematical Society Semicentennial Publications,
vol. 1, New York 1938, V + 262 pp.
[2] MOORE, E. H., On the foundations of mathematics. Bulletin of the American
Mathematical Society, vol. 9 (1902-03), pp. 402-424.
[3] MOORE, R. L., On the foundations of plane analysis situs. Transactions of the
American Mathematical Society, vol. 17 (1916), pp. 131-164.
[4] MOORE, R. L., Concerning a set of postulates for plane analysis situs. Transactions
of the American Mathematical Society, vol. 20 (1919), pp. 169-178.
[5] MOORE, R. L., Concerning upper semi-continuous collections of continua.
Transactions of the American Mathematical Society, vol. 27 (1925), pp. 416-428.
[6] POINCARÉ, H., The foundations of science. Lancaster, Pa., 1946, XI + 553 pp.
[7] WEYL, H., Emmy Noether. Scripta Mathematica, vol. 3 (1935), pp. 1-20.
[8] WILDER, R. L., Concerning R. L. Moore's axioms Σ₁ for plane analysis situs.
Bulletin of the American Mathematical Society, vol. 34 (1928), pp. 752-760.